Constrained interleaving for 5G wireless and optical transport networks

ABSTRACT

The present invention provides a design framework that is used to develop new types of constrained turbo block convolutional (CTBC) codes that have higher performance than was previously attainable. The design framework is applied to design both random and deterministic constrained interleavers. Vectorizable deterministic constrained interleavers are developed and used to design parallel architectures for real time SISO decoding of CTBC codes. A new signal mapping technique called constrained interleaved coded modulation (CICM) is also developed. CICM is then used to develop rate matching, spatial modulation, and MIMO modulation subsystems to be used with CTBC codes and other types of codes. By way of example, embodiments are primarily provided for improved 5G LTE and optical transport network (OTN) communication systems.

This application is a continuation of co-pending U.S. patent application Ser. No. 14/545,588, entitled “Constrained interleaving for 5G wireless and optical transport networks,” filed May 27, 2015.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates generally to methods, apparatus and systems for communication encoders, decoders, transmitters, receivers and infrastructure and/or user devices. More particularly, aspects of the invention relate to constrained turbo block convolutional codes, constrained interleaving, and related methods, apparatus, and systems for improved constrained interleaving, encoding, decoding, signal mapping, MIMO applications, spatial modulation, and rate matching. The present invention also relates to efficient parallel ASICs and VLSI architectures and optical integrated circuit architectures to implement these methods, apparatus, and systems.

Description of the Related Art

A large body of prior art includes technical publications, patents, and standards that relate to 4G LTE (fourth generation long term evolution) wireless systems. In particular, the relevant prior art relates to encoding and decoding architectures and algorithms for use with the CTC (convolutional turbo code) specified for use with 4G LTE. Specifically, important prior art relates to algorithms and high performance ASIC architectures for CTC encoding/decoding, deterministic contention-free interleavers such as the QPP (quadratic polynomial permutation) based interleavers, and rate matching/puncturing architectures.

A parallel decoding ASIC for the CTCs used in 4G LTE can be found in C. Studer, C. Benkeser, S. Belfanti, and Q. Huang, “Design and Implementation of a Parallel Turbo-Decoder ASIC for 3GPP-LTE,” IEEE J. Solid State Circuits, Vol. 46, No. 1, January 2011 (referred to as the “Studer reference” herein). A follow-on paper explains further improvements and details about efficient parallel decoding of the CTC used in 4G LTE. This second technical publication is: C. Roth, S. Belfanti, C. Benkeser, and Q. Huang, “Efficient parallel turbo-decoding for high throughput wireless systems,” IEEE Transactions on Circuits and Systems, 2012 (referred to as the “Roth reference” herein).

One of ordinary skill in the art would be familiar with the Studer reference, which explains how to design a highly optimized parallel real time ASIC that implements the CTC specified for use in 4G LTE. The Roth reference provides further details and optimizations to the same architecture as described in the Studer reference. One of ordinary skill in the art would also be familiar with the following prior art reference: A. Nimbalker, Y. Blankenship, B. Classon, T. K. Blankenship, “ARP and QPP Interleavers for LTE Turbo Coding,” WCNC 2008 proceedings (referred to as the “Nimbalker reference” herein). The architecture in the Studer reference uses the QPP interleaver as described in the Nimbalker reference. The QPP interleaver is important because it is used in the 4G LTE standard and because it can be described as a “contention free,” “vectorizable,” and “deterministic” interleaver.

As is well known, “contention free”/“vectorizable” means that the permutation function has a particular property that aids in parallel processing implementations. Consider a case where there are N=8 parallel processors. Then, as long as N divides the frame size, K, the contention free interleaver places on a given row in memory all of the N elements to be processed by the N processors in a given clock cycle. The QPP only supports up to N=8 level vectorization.
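The contention free property can be checked numerically. The following is a minimal sketch, not the method of any cited reference. It assumes the commonly used window model in which each of M processors owns W=K/M consecutive indices and address a is stored in memory bank a // W. The parameter values K=40, f₁=3, f₂=10 and all function names are illustrative assumptions.

```python
def qpp(i, K, f1, f2):
    """QPP address: (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def is_contention_free(K, f1, f2, M):
    """Return True if, at every step j, the M processors hit M distinct banks.

    Assumed model: processor t handles index j + t*W at step j, and the
    address a it reads lives in bank a // W, where W = K // M.
    """
    if K % M:
        raise ValueError("M must divide K")
    W = K // M
    for j in range(W):
        banks = {qpp(j + t * W, K, f1, f2) // W for t in range(M)}
        if len(banks) != M:
            return False
    return True


# Illustrative parameters (assumed for this sketch).
K, F1, F2 = 40, 3, 10
results = {M: is_contention_free(K, F1, F2, M) for M in (2, 4, 8)}
print(results)
```

In this model a collision would show up as two processors needing the same bank in the same clock cycle, which is what a parallel decoder must avoid.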

The Studer reference also points out a very efficient way to compute the QPP address sequence. As per the Nimbalker reference, the QPP interleaved address sequence can be written as

π_(QPP)(i)=(f₁i+f₂i²)mod K  (1)

where f₁ and f₂ are suitably chosen interleaver parameters that depend on the code-block size K. Note that in this notation the sequentially incremented symbol i is used to denote a coded bit position in the transmitted frame, and the permuted version of the indexing sequence, π_(QPP)(i), is used to look up a bit position in the non-permuted sequence of input bits. The Studer reference explains that a very efficient way to compute equation (1) is to use the following set of recursions, which use only additions and modulo operations and can therefore be implemented very efficiently in hardware. Hence at runtime, in hardware, equation (1) is computed as

π_(QPP)(i+1)=(π_(QPP)(i)+δ(i))mod K  (2)

and

δ(i+1)=(δ(i)+b)mod K  (3)

where π_(QPP)(0)=0, δ(0)=f₁+f₂, and b=2f₂.
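The recursions (2)-(3) follow from equation (1) because π_(QPP)(i+1)−π_(QPP)(i)=f₁+f₂+2f₂i (mod K), so the increment δ(i) itself grows by the constant b=2f₂ at each step. A minimal software sketch comparing the direct formula with the recursion is given below. The function names and the parameter values K=40, f₁=3, f₂=10 are assumptions chosen for illustration.

```python
def qpp_direct(K, f1, f2):
    """Evaluate equation (1) directly for i = 0..K-1."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def qpp_recursive(K, f1, f2):
    """Evaluate equations (2)-(3) using only additions and modulo reductions."""
    addrs = []
    pi = 0                    # pi(0) = 0
    delta = (f1 + f2) % K     # delta(0) = f1 + f2
    b = (2 * f2) % K          # constant second difference
    for _ in range(K):
        addrs.append(pi)
        pi = (pi + delta) % K        # equation (2)
        delta = (delta + b) % K      # equation (3)
    return addrs


# Illustrative parameters (assumed for this sketch).
direct = qpp_direct(40, 3, 10)
recursive = qpp_recursive(40, 3, 10)
print(direct == recursive)
```

The recursive form needs no multiplier in the inner loop, which is the property the hardware implementation relies on.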

Another prior art reference that is known to those of skill in the art and that goes into further detail about QPP recursions is: Y. Sun and J. Cavallaro, “Efficient hardware implementation of a highly-parallel 3GPP LTE/LTE-advanced turbo decoder,” Integration, the VLSI Journal, No. 44, 2011, pp. 305-315 (referred to as the “Sun reference” herein). This reference provides additional recursions that allow QPP addresses to be incremented by an integer, d=Δi, that can be any positive integer. This allows forward and backward sequences of QPP addresses to be generated for the forward and backward recursions used in decoding. It also allows recursions similar to equations (2)-(3) to increment by more than one element, for example, Δi=K/M, where K is the frame size and M is the number of processors in a system. The Sun reference also explains the prior art knowledge that a set of M different QPP address generators can be run in parallel with relative offsets of one and with Δi=K/M to generate a set of M consecutive QPP addresses in parallel. The Sun reference also provides efficient hardware circuits to implement such an addressing scheme.
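A stride-d version of the recursion can be derived directly from equation (1): π_(QPP)(i+d)−π_(QPP)(i)=f₁d+f₂d²+2f₂di (mod K), and this increment grows by the constant 2f₂d² per step. The sketch below is based on that derivation only and is not taken from the Sun reference. The function name and parameter values are assumptions.

```python
def qpp_stride(K, f1, f2, start, d, count):
    """Generate pi(start), pi(start+d), pi(start+2d), ... with additions only.

    d may be negative, which yields a backward address sequence.
    """
    pi = (f1 * start + f2 * start * start) % K          # initial address
    g = (f1 * d + f2 * d * d + 2 * f2 * d * start) % K  # first increment
    c = (2 * f2 * d * d) % K                            # constant second difference
    out = []
    for _ in range(count):
        out.append(pi)
        pi = (pi + g) % K
        g = (g + c) % K
    return out


# Illustrative parameters (assumed): K=40, f1=3, f2=10, M=4 processors.
K, F1, F2, M = 40, 3, 10, 4
forward = qpp_stride(K, F1, F2, start=0, d=1, count=K)
backward = qpp_stride(K, F1, F2, start=K - 1, d=-1, count=K)
windowed = qpp_stride(K, F1, F2, start=0, d=K // M, count=M)
print(windowed)
```

Here `windowed` holds the addresses that M processors with window offset K/M would need at the same step, while `forward` and `backward` serve the two recursion directions used in decoding.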

Another relevant field of art is called rate matching. Rate matching is also known as “puncturing.” The CTC mother code defined in the LTE standards is a rate 1/3 parallel concatenated turbo code. This CTC leads to very complicated rate matching circuits at both the encoder and the decoder, thus increasing the overall hardware complexity of 4G LTE CTC encoding and decoding. A reference that discusses rate matching for LTE turbo codes is C. Ma and P. Lin, “Efficient implementation of rate matching for LTE codes,” IEEE ICFCC 2010 international conference proceedings, pp. V1-704-708 (referred to as the “Ma reference” herein). FIG. 1 of the Ma reference shows the basic configuration of 4G LTE rate matching at the transmitter side. The data stream plus two streams of parity bits from the rate 1/3 parallel concatenated CTC pass through three parallel blocks labeled “sub-block interleaver.” That is, three interleavers are used, one each to process the total number of bits in a non-punctured frame. Another reference that explains the rate matching used in 4G LTE is L. Yu et al., “An improved rate matching algorithm for 3GPP LTE Turbo code,” Conference on Communications and Mobile Computing (CMC), pp. 345-348, April 2011. FIG. 2 of this article and the discussion thereof is very helpful in understanding the 4G LTE rate matching algorithm.

There also exists a vast body of literature related to OTN (optical transport network) applications. OTN applications are demanding because they require very high data rates and powerful codes, and the frame size used in coding/decoding is long (122,368 message bits plus coding overhead bits). OTN systems are either already available or still being researched and developed to support data rates of 100 GBPS (usually referred to as 100G), 400 GBPS and even up to 1000 GBPS (1 Terabit per second, 1 T). These very high speed systems demand very powerful codes to achieve specified high NCGs (net coding gains) at very low BERs (bit error rates) below 10⁻¹⁵. High speed digital hardware that employs extensive parallel processing is needed to decode these powerful codes in real time.

It can be noted that in OTN applications, the codes being used/considered now correspond to LDPC (low density parity check) codes, concatenations of LDPC codes with one or more long block codes, or TPCs (turbo product codes). OTN applications cannot use CTCs like LTE does because the error floors required by OTN applications are far below those afforded by CTCs. Hence it would be desirable to have a much lower complexity parallel coding/decoding technique and parallel architecture than those that are currently proposed for use in or used in the OTN field. It would be desirable if this low complexity coding/decoding technique could meet the stringent NCG requirements at BERs of 10⁻¹⁵ and outperform all known coding/decoding techniques that are currently proposed for use in or used in the OTN field.

The prior art also includes U.S. Pat. No. 8,537,919, “Encoding and decoding using constrained interleaving,” and its continuation-in-part, U.S. Pat. No. 8,532,209, “Methods, apparatus and systems for coding with constrained interleaving.” Both of these US patents are incorporated herein by reference in order to provide the reader with written description level details of known constrained interleaver design techniques, and known encoder/decoder structures that use constrained interleaving. These patents are incorporated by reference, but it is to be understood that for claim construction purposes, the instant written description should be used, and not any of the written description in the incorporated-by-reference patents. In this patent application, some terms are defined differently than in the US patents incorporated by reference herein. Therefore, it is to be understood that the interpretation of terms and phrases used in the claims herein should be taken in the context of the present application and not the references incorporated herein. The prior art also includes J. Fonseka, E. Dowling, S. I. Han and Y. Hu, “Constrained interleaving of serially concatenated codes with inner recursive codes,” IEEE Communications Letters, Vol. 17, No. 7, July 2013, referred to herein as “the Fonseka [1] reference.” The prior art also includes J. Fonseka, E. Dowling, T. Brown and S. I. Han, “Constrained interleaving of turbo product codes,” IEEE Communications Letters, vol. 16, 2012, pp. 1365-1368, September 2012, referred to herein as “the Fonseka [2] reference.” The prior art also includes S. I. Han, J. P. Fonseka and E. M. Dowling, “Constrained Turbo Block Convolutional Codes for 100G and Beyond Optical Transmissions,” IEEE Photonics Technology Letters, Vol. 26, No. 10, May 2014, referred to herein as “the Fonseka [3] reference.” The above-listed patents and technical publications also cite to related articles in the technical literature and to other U.S. Patent references, which are also part of the prior art. It can be noted that the above referenced patents and technical papers constitute at least a portion of what would be known to one of skill in the art of CTBC (constrained turbo block convolutional) codes.

Consider FIG. 1, which corresponds to FIG. 4 in U.S. Pat. Nos. 8,537,919 and 8,532,209. FIG. 1 shows an encoder structure that can represent a method and/or an apparatus for encoding in accordance with a CTBC code. The CTBC encoder embodiment of FIG. 1 makes use of an outer block code (OBC) encoder 405 that encodes in accordance with a selected OBC. For example, the OBC can be an (n,k) block code, B, where n>k and n,k are positive integers. The message bit stream at the input can be considered to be a sequence of k-bit blocks consisting of message bits. Each k-bit message block is first processed by the OBC encoder 405 which, in the exemplary embodiment of FIG. 1, encodes according to an (n,k) outer code with minimum Hamming distance (MHD) given by MHD=d₀. In some embodiments the outer code 405 can perform outer encoding in accordance with other types of fixed-length codes, such as a finite-length convolutional code or an LDPC code, for example. A characterizing feature of the embodiment of FIG. 1 is that it also makes use of an inner recursive convolutional code (IRCC) encoder 415 that encodes its input bit stream in accordance with an inner recursive convolutional code (the selected IRCC). An appropriate IRCC is chosen to have an MHD given by MHD=d_(i). For example, the IRCC could be selected to be the rate-1 accumulator given by G(D)=1/(1+D). Another specific example of an IRCC is to use the rate-1 accumulator followed by a (λ,λ−1) SPC encoder (or any other block code), a finite-length (finite impulse response) convolutional code, or any other recursive convolutional code (RCC). The value of λ can be chosen to provide design flexibility to choose the IRCC to fine tune the rate and/or the d_(i) value to design a CTBC code to meet a particular set of design specifications. In some embodiments, the CTBC code is designed using the rate-1 accumulator as the IRCC, but this CTBC code is then followed by another block code like the (λ,λ−1) SPC encoder mentioned above.
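The serial chain of FIG. 1 (outer block code, interleaver, rate-1 accumulator) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: a single parity check code stands in for the outer block code B, and `PERM` is an ordinary fixed permutation rather than a constrained interleaver designed to meet any MHD target.

```python
def spc_encode(block):
    """(k+1, k) single parity check code: append one even-parity bit."""
    return block + [sum(block) % 2]


def accumulate(bits):
    """Rate-1 accumulator G(D) = 1/(1+D): y[t] = x[t] XOR y[t-1], with y[-1] = 0."""
    out, state = [], 0
    for x in bits:
        state ^= x
        out.append(state)
    return out


def toy_encode(message, k, perm):
    """Outer-encode k bits at a time, permute, then run the accumulator."""
    outer = []
    for s in range(0, len(message), k):
        outer += spc_encode(message[s:s + k])
    # Output position i reads outer-coded bit perm[i].
    interleaved = [outer[p] for p in perm]
    return accumulate(interleaved)


# Hypothetical example: two 3-bit message blocks and an 8-position permutation.
MESSAGE = [1, 0, 1, 0, 0, 1]
PERM = [0, 4, 1, 5, 2, 6, 3, 7]
codeword = toy_encode(MESSAGE, k=3, perm=PERM)
print(codeword)
```

Because the accumulator is recursive, a single 1 at its input produces a run of 1s at its output until another 1 arrives, which is why the placement of the outer-coded bits by the interleaver governs the weight of the final codeword.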

Another characterizing feature of the CTBC encoder 400 is that it makes use of a constrained interleaver 410. Any specific CTBC code is defined in terms of the specifically selected outer block code B used in the OBC encoder 405, the specifically selected recursive convolutional code (RCC) used in the IRCC encoder 415, and a specifically selected constrained interleaver having a specified size and permutation function used in block 410. The constrained interleaver 410 and various forms of its interleaver constraints are described in the above-cited prior art references. The constrained interleaver 410 can be designed to provide an interleaver gain, G, similar to uniform interleaving, but also can be designed to ensure that the net MHD of the entire CTBC code satisfies some target MHD, d_(t)≧d₀d_(i). It can be noted that if the constrained interleaver used in the CTBC were to be replaced by a uniform interleaver of the same length, a “Uniform-interleaved Turbo Block Convolutional” (UTBC) code would result, and the MHD of this corresponding UTBC code would typically be close to MHD, d_(t)=d_(i).

Various forms of constrained interleavers are defined in the above-referenced US patents and the three above-cited references related to constrained interleaving. A constrained interleaver type 2, i.e., the “CI-2,” is introduced and used in the block 410 of FIG. 1. The above-referenced US patents teach how CI-2 interleaver constraints can be defined to design the constrained interleaver 410 to enforce the property MHD, d_(t)≧d₀d_(i). In U.S. Pat. No. 8,532,209, the term and notation “Constrained interleaver type 2” and its abbreviation “CI-2” are introduced. In the Fonseka [2] reference, it is shown that CI-2s can be designed to achieve a specified target MHD that satisfies d₀d_(i)≦MHD≦d₀²d_(i). CI-2s use inter-row constraints in order to achieve this. Note that the constrained interleaver block 410 in FIG. 1 is labeled “r×ρn constrained interleaver.” This is because, as discussed in the above-referenced US patents, the constrained interleaver's permutation function is designed using an r×ρn row-column matrix structure. That is, the prior art relies upon the CI-2 design matrix, [A]_(r×ρn), and requires certain relations to hold for coded bit positions from different codewords of the OBC that are loaded into [A]_(r×ρn). In the Fonseka [1] reference, the symbol for the number of rows of the CI-2 design matrix was changed to the symbol “L,” and the CI-2 design matrix is thus written as [A]_(L×ρn). In the rest of this patent application, from here forward, the symbol L will be used to refer to the number of rows in the CI-2 design matrix.

An objective of the CI-2 interleaver is to create CTBC codes that simultaneously provide a specified high MHD while achieving as high an interleaver gain as possible.

The high MHD provides a lower error floor and has other desirable effects in various types of channels, and the high interleaver gain ensures a high coding gain for the CTBC code. However, the interleaver gain attainable by the CI-2 is limited to a large extent by the number of rows, L, in the CI-2 design matrix. The lower the number L, for a fixed frame size K, the higher the CI-2 interleaver gain. However, when CI-2 interleavers are used, lowering L will eventually limit the achievable MHD.

It would be desirable to have improved constrained interleavers that do not require a CI-2 design matrix, but instead use L=1, and can thus lead to improved CTBC codes that have higher interleaver gains as compared to a CI-2 interleaver of the same length. It would be desirable to further include improved signal mapping methods, apparatus and systems to map a CTBC code onto a target signal constellation in such a way as to provide a constellation mapping gain, similar to the kinds of gains provided by trellis coded modulation (TCM) and bit interleaved coded modulation (BICM). It would also be desirable to have new rate matching algorithms that could efficiently interoperate with these new and improved CTBC codes and signal mapping subsystems. It would also be desirable to have algorithms developed for applications in multiple input multiple output (MIMO) systems and spatial modulation subsystems for use in communications devices that include a multi-antenna subsystem.

Next consider FIG. 2, which corresponds to FIG. 5 in U.S. Pat. Nos. 8,537,919 and 8,532,209. FIG. 2 shows a prior art receiver method and apparatus for a receiver 500 used to receive and decode a signal r(t) which was generated in accordance with FIG. 1 or a version of a serial concatenated code whose inner code is a block code that is also discussed in U.S. Pat. Nos. 8,537,919 and 8,532,209. It is important to note that when CTBC codes are generated using an IRCC as shown in FIG. 1 herein, block 510 and the connection between blocks 510 and 525 will be missing. The block 510 is only used to decode coded signals generated by an alternative embodiment shown in FIG. 2 of the above two referenced patents. So herein, block 510 and its connection to block 525 should be ignored.

Block 1105 processes or otherwise demodulates a received signal r(t) to generate an initial vector r_(S), which preferably corresponds to a vector of bit metrics. The bit metrics are preferably used in decoding of the component codes using an a-posteriori probability (APP) decoding technique.

The IRCC soft in soft out (SISO) decoder 515 can implement a well known soft decoding algorithm such as the BCJR algorithm, a soft output Viterbi algorithm (SOVA), or the min-sum algorithm. Such algorithms are known to generate extrinsic information indicative of the reliability of the soft decoded results. The BCJR algorithm can be embodied using any of the MAP, Log-MAP, or Max-Log-MAP algorithms. For example, if the IRCC SISO decoder 515 involves the BCJR algorithm, then the IRCC SISO decoder 515 will need to compute a sequence of branch transition probabilities, γ's, that each are a function of a respective element of the received signal metrics, r_(s), and a corresponding respective element of updated or initial extrinsic information, the L₃'s. The IRCC SISO decoder 515 will use this sequence of branch transition probabilities, γ's, while making one forward recursion pass to update a set of state metrics, α's, and one backward recursion pass to update a set of state metrics, β's. Such concepts are well known in the art in the context of decoding convolutional turbo codes (CTCs). Using the calculated α, β and γ values, the BCJR decoding of the IRCC decoder calculates the extrinsic information of all its input bits. For example, see P. Robertson, et al., “A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain,” IEEE ICC 1995, pp. 1009-1013.
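The practical difference between the Log-MAP and Max-Log-MAP embodiments lies in how the log-domain sum ln(e^a+e^b) is evaluated inside the α and β recursions. The sketch below illustrates only that one operation and is not a full BCJR decoder. The function names are assumptions.

```python
import math


def max_star(a, b):
    """Exact log-domain addition (Jacobian logarithm), as used by Log-MAP."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """Max-Log-MAP approximation: the correction term is dropped."""
    return max(a, b)


# Arbitrary illustrative metric values.
a, b = 1.2, 0.7
exact = max_star(a, b)
approx = max_log(a, b)
print(exact, approx)
```

The dropped correction term lies between 0 and ln 2, so the approximation is tightest when the two competing metrics are far apart.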

The IRCC SISO decoder 515 couples its extrinsic information output to a constrained deinterleaver 520 which deinterleaves the extrinsic information received from the IRCC SISO decoder 515, for example, in accordance with the inverse CI-2 permutation function. The OBC SISO decoder 525 is coupled to receive the deinterleaved extrinsic information from the constrained deinterleaver 520. The OBC SISO decoder 525 also preferably implements a known soft decoding algorithm such as the well known Chase-Pyndiah algorithm (also referred to as the Pyndiah algorithm), the low complexity Chase-Pyndiah algorithm, the OSD algorithm and its low complexity variations, or any similar soft decoding algorithm for decoding of block codes, for example. In general, different well known (or proprietary) soft decoding algorithms can be used in the blocks 515 and 525. All such algorithms are well known to those of skill in the art; for example, see J. Cho and W. Sung, “Reduced complexity Chase-Pyndiah decoding for turbo product codes,” pp. 210-215, IEEE workshop on signal processing systems, October 2011.

It would be desirable to have decoding architectures that could be used to efficiently decode the new improved CTBC codes. It would be desirable to have additional efficient algorithms and parallel architectures to decode the improved CTBC codes that have undergone additional constrained interleaving based signal mapping and/or rate matching and/or constrained interleaving based spatial modulation.

While the above mentioned prior art relating to constrained interleaving for use with an OBC and an IRCC provides very powerful CTBC codes, the CI-2 is based on the CI-2 design matrix, [A]_(L×ρn), and the concept of a random interleaver. The construction of the CI-2 requires many randomization operations performed in the CI-2 design matrix and a complicated process of ensuring that randomizations do not violate any constraints in the CI-2 design matrix. As discussed below, this CI-2 design matrix and design process actually limits BER performance. Also, the CI-2 is not a vectorizable/contention free interleaver. Herein a “random interleaver” is also defined in opposition to a “deterministic interleaver” that uses a mathematical formula to generate the deterministic interleaver permutation. A random interleaver is thus often implemented as a table look up or with a state-machine logic circuit whose sequencing logic does not use a fixed mathematical equation but whose state transition logic needs to be specifically designed for each frame size.

It would be desirable to have a family of contention free, vectorizable constrained interleavers, both deterministic and semi-random. It would be desirable to have an SCC (serially concatenated code) that is constructed by coupling the output of the OBC to the IRCC via a contention free, vectorizable and deterministic version of a constrained interleaver. It would further be desirable to be able to design a system that could achieve the memory efficient benefits of the Studer reference, and to also greatly simplify the rate matching requirements of the system. It would be desirable to have a parallel architecture that could meet the encoding and decoding performance requirements of the 4G LTE CTC encoders and decoders, but with simpler computational functional units, less overall computational complexity, and thus lower power consumption. It would be desirable to have a CTBC encoder/decoder architecture that could eliminate the complicated and hardware intensive rate matching and inverse rate matching subsystems required by 4G LTE encoders and decoders. It would also be desirable if the parameters of this same CTBC encoder/decoder architecture could be scaled to higher values of N levels of parallelism and designed to provide the NCGs needed at BERs of 10⁻¹⁵ for 400G and beyond OTN applications. It would be desirable to also have new coded modulation techniques that could be used to map codes onto higher order constellations and to implement advanced functions such as rate matching, spatial modulation, and MIMO systems. It would be desirable if the advanced modulation technique could be used along with optical integrated circuits and similar technology to implement higher capacity optical communication channels, for example 400G and beyond, and 1 T and beyond. It would be desirable to have a constrained interleaver design process that did not rely on the CI-2 design matrix and was able to provide higher BER performance for random and deterministic constrained interleavers.

SUMMARY OF THE INVENTION

Using the abbreviations CI=“constrained interleaver” and CICM=“constrained interleaved coded modulation,” and other more common abbreviations that are all defined herein, the present patent application is organized and the present invention can be summarized into sub-invention categories as follows:

i) CI-(L=1) (single row) constrained interleaver, encoder, decoder.
ii) CI-3 constraints and design approach.
iii) CI-4 constraints and design approach.
iv) CI-3 and CI-4 design approach using one or more target MHDs.
v) Vectorizable, deterministic CI, encoder, decoder.
vi) Parallel decoder chip architectures.
vii) CICM Signal Mapper subsystem.
viii) CICM Rate Matching subsystem/Variable Redundancy and Vectorizable embodiments.
ix) CICM embodiments with unequal error protection.
x) CICM MIMO Spatial Modulation subsystem and processing algorithms.
xi) Optical subsystem and optical IC with signature filters for use in WDM SM and MIMO OTN, 100G, 400+G, 1 Tera bit, and beyond; and non-optical embodiments using analog, discrete-time or purely digital signature filter banks.
xii) OFDM Related Embodiments.
xiii) System level aspects: handhelds, headends, systems.

In accordance with a first aspect of the present invention, constrained interleavers are designed that only use a single row vector as opposed to a CI-1 or a CI-2 design matrix, which always needs more than one row to meet non-trivial MHD design objectives. For example, CI-3 and CI-4 constrained interleavers are designed by identifying restricted zones of numbers where a pseudo-random number generator cannot generate an output. These restricted zones correspond to sets of adjacent integers within the integer domain [0,K−1]. The length of the constrained interleaver is K, and it is used to permute the integer ring [0,K−1]=[0, . . . , K−1] to a permuted version of this integer ring, which can be denoted as π[0,K−1]. An unconstrained pseudo-random permutation can map [0,K−1] to any reordering of [0,K−1]. In contrast, a constrained interleaver in accordance with the present invention imposes constraints that eliminate any possible reordering that would cause a particular index of [0,K−1] to be mapped to a position (index) in π[0,K−1] that would violate a constraint. The constraints are implemented by sequentially pseudo-randomly permuting (placing) indices (positions) in the integer ring [0,K−1] to new positions (indices) in π[0,K−1] subject to the constraint of not permuting any index of [0,K−1] into any restricted zone in π[0,K−1]. The restricted zones are used to identify ranges of indices in π[0,K−1] where, if a particular index from [0,K−1] were to be placed, a low weight CTBC codeword would be/could be generated. Herein, the phrases “low weight codeword,” “low weight error sequence,” and “low distance error sequence” generally correspond to any possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f), where none of the possible low weight encoded bit sequences, i_(P), can have a weight less than d_(t). Here the weights d_(t)≦d≦d_(f) correspond to Hamming distances, and the coded sequence can be a CTBC coded sequence or some other kind of encoded sequence encoded in accordance with a code for which the low weight error sequences can be identified and enumerated. The interleaver constraints are used to eliminate the possibility of the generation/existence of any low weight CTBC codewords that have weight below a target MHD value denoted as d_(t).
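The sequential placement procedure described above can be sketched in software. The sketch below is illustrative only and does not reproduce the CI-3 or CI-4 constraints, which are not stated in this passage. It assumes a single hypothetical rule: the coded bits of one n-bit outer codeword must land at least s positions apart, so each placed bit creates a restricted zone of adjacent positions for its codeword mates. The function name, the rule, and the parameter values are all assumptions.

```python
import random


def restricted_zone_interleaver(K, n, s, seed=0, max_tries=1000):
    """Return pos, where input index i is placed at output position pos[i].

    Hypothetical constraint: indices belonging to the same n-bit codeword
    are placed at least s positions apart. Restarts if placement dead-ends.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        pos = [None] * K
        free = set(range(K))
        complete = True
        for i in range(K):
            first = (i // n) * n                    # first index of this codeword
            mates = [pos[j] for j in range(first, i)]
            # Free positions outside every mate's restricted zone.
            candidates = [p for p in sorted(free)
                          if all(abs(p - q) >= s for q in mates)]
            if not candidates:
                complete = False                    # dead end: start over
                break
            pos[i] = rng.choice(candidates)
            free.remove(pos[i])
        if complete:
            return pos
    raise RuntimeError("no valid permutation found; relax s or raise max_tries")


# Hypothetical parameters: K=24 coded bits, n=4 bits per codeword, spacing s=3.
pos = restricted_zone_interleaver(K=24, n=4, s=3)
print(pos)
```

A practical design would derive the restricted zones from the enumerated low weight error sequences of the chosen outer and inner codes rather than from a fixed spacing.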

It should be noted that while U.S. Pat. Nos. 8,537,919 and 8,532,209 disclose the general genus of constrained interleavers and certain species such as the CI-1 and CI-2 species of constrained interleavers, an aspect of the present invention discloses additional specific novel species that are members of the genus of constrained interleaver inventions. That is, the present invention specifically discloses the two new species of constrained interleavers, CI-3 and CI-4. Both CI-3 and CI-4 are members of the newly disclosed sub-genus class SRCI (single row constrained interleaving).

SRCI as performed in accordance with the present invention provides several advantages. First, the interleaver gain can be improved in comparison to the prior art CI-2 because the number of restricted-out permutation possibilities decreases with respect to the CI-2 design method. Second, it is possible to design CTBC codes that allow different target MHD values to be used for different categories of low weight error sequences. This is important because certain categories of low distance error sequences can be identified that are relatively much less likely. These categories of low distance error sequences have low associated error coefficients. Therefore, the overall probability of error can be reduced by allowing a lower MHD for these less likely categories of low distance error sequences. This allows the overall probability of error to be reduced by balancing MHD and error coefficient products in the error probability expression as a function of distance spectra. A third advantage of the SRCI approach is that it provides additional flexibility that allows vectorizable (contention free) deterministic constrained interleavers to be designed using the single row type interleaver constraints.
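The trade-off between distance and error coefficient can be illustrated with the standard union bound term for BPSK over an AWGN channel, in which a category of error sequences at distance d with error coefficient B contributes approximately B·Q(√(2dR·E_b/N₀)) to the bit error probability. The channel model, the code rate, and every numeric value below are hypothetical and chosen only to show the effect.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def union_bound_term(d, coeff, rate, ebn0_db):
    """Contribution of one distance category (assumed BPSK over AWGN)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return coeff * q_func(math.sqrt(2.0 * d * rate * ebn0))


# Hypothetical categories: a rare one at lower distance, a common one at higher distance.
rare_low_d = union_bound_term(d=10, coeff=1e-6, rate=0.8, ebn0_db=3.0)
common_high_d = union_bound_term(d=12, coeff=1e-1, rate=0.8, ebn0_db=3.0)
print(rare_low_d, common_high_d)
```

With these assumed numbers the rare category at the lower distance contributes far less than the common category at the higher distance, which is the sense in which a lower target MHD can be tolerated for unlikely error sequences.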

Another aspect of the present invention focuses on parallel processing architectures that can be used to implement chips, systems of chips or chip subsystems for encoding and decoding of CTBC codes. These parallel architectures make use of the contention free deterministic constrained interleaver along with a parallel-access memory architecture as well as parallel processing units that perform SISO decoding in parallel.

Another aspect of the present invention centers around CICM (constrained interleaved coded modulation). CICM signal mapping is used to map coded sequences such as CTBC coded sequences and other coded sequences for which the low distance error sequences (lowest weight codewords) can be identified and tabulated. CICM signal mappers use a permutation, Γ, to permute the coded bit positions of a coded bit stream of frame size K onto a sequence of K/m groups of m bits, each of which will be mapped to a 2^(m)-ary symbol in accordance with a selected constellation mapping rule. The constellation mapping rule preferably uses RGC (reverse Gray coding) to map groups of m bits at a time onto the 2^(m)-ary signal constellation points. The combination of the CICM permutation Γ and the constellation mapping rule is preferably designed to ensure that at least one of a symbol Hamming distance and an MSED (minimum squared Euclidean distance) is achieved. This is achieved by keeping track of a set of low distance error sequences that can be generated at weights d, where d_(t)≦d≦d_(f). Similar to the CI permutation π[0,K−1], Γ is constrained to ensure that low distance error sequences are avoided, but now in terms of symbol Hamming distance and MSED on the transmitted sequence as opposed to the encoded bit stream itself. In preferred embodiments both the symbol Hamming distance and the MSED are jointly achieved. In unequal error protection embodiments, the symbol Hamming distance and a plurality of different MSEDs for different subsets of message bits are jointly achieved. In all such systems mentioned above, when AWGN (additive white Gaussian noise) channels are in use, it may not be necessary to maintain a given symbol Hamming distance, so that maintaining/achieving a given symbol Hamming distance becomes optional. In general, the CICM permutation and mapping is selected to improve or optimize the net probability of error on the channel.
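The effect of the permutation Γ on symbol Hamming distance can be shown with a small sketch. It counts how many m-bit symbols a given bit-level error sequence touches once Γ has regrouped the coded bits. The function name, the convention that `gamma[p]` is the output position of coded bit p, and the example permutations are assumptions, and the sketch does not model the constellation mapping rule or the MSED.

```python
def symbol_hamming_weight(error_positions, gamma, m):
    """Number of distinct m-bit symbols hit by an error pattern after permutation gamma."""
    return len({gamma[p] // m for p in error_positions})


# Hypothetical example: K=8 coded bits, m=2 bits per symbol (4-ary symbols),
# and a weight-4 error sequence occupying coded bit positions 0..3.
ERRORS = [0, 1, 2, 3]
IDENTITY = list(range(8))
SPREAD = [0, 2, 4, 6, 1, 3, 5, 7]   # sends neighbouring bits to different symbols

clustered = symbol_hamming_weight(ERRORS, IDENTITY, m=2)
spread = symbol_hamming_weight(ERRORS, SPREAD, m=2)
print(clustered, spread)
```

The same bit-level error sequence spans two symbols under the identity grouping and four symbols under the spreading permutation, which is the quantity a constrained Γ is meant to keep above a target for the tabulated low distance error sequences.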

CICM can be used to aid in a variety of areas. For example, CICM is used herein to implement rate matching/puncturing/variable redundancy. Also, CICM is used to implement improved SM (spatial modulation) and MIMO (multiple input multiple output) systems such as multiple-antenna wireless systems for potential use, for example, in 5G and beyond wireless systems.

Another aspect of the present invention involves the design of OTN (optical transport network) systems for 100G and beyond fiber optic or free space laser communication systems. The present invention shows how to design and implement optical subsystems using filter banks constructed using a plurality of known optical discrete-time filters that can be implemented in coupled fiber subsystems and/or optical integrated circuits. The optical discrete time filter banks are used to implement a transmit portion of a MIMO type channel matrix, H. The output of the optical discrete time filter banks is coupled onto a single fiber or free space optical laser channel for transmission. At the receiver, another optical filter bank is used to implement a receive portion of the MIMO type channel matrix, H. Both SM and MIMO type modulation formats are disclosed. The SM and MIMO type systems can be used to increase the performance and data rate of the optical communication system at a given noise level.

BRIEF DESCRIPTION OF THE DRAWINGS

The various novel features of the present invention are illustrated in the figures listed below and described in the detailed description that follows.

FIG. 1 is a block diagram of an embodiment of a prior art encoder that encodes data bits in accordance with a constrained turbo block convolutional (CTBC) code and maps the CTBC encoded sequence to a channel for transmission.

FIG. 2 is a block diagram of an embodiment of a prior art receiver method and apparatus that makes use of an iterative soft input soft output (SISO) decoder to decode a received version of a CTBC code such as generated by FIG. 1.

FIG. 3 is a block diagram of a CTBC code encoder that uses an L=1 constrained interleaver designed to provide higher interleaver gain and/or higher MHD and/or improved BER performance as compared to prior art CTBC codes that relied on a CI-2 interleaver with L>1.

FIG. 4A illustrates how the act of placing a next bit of a single codeword can give rise to a restricted zone.

FIG. 4B illustrates how the act of placing a next bit of a codeword can give rise to a restricted zone in a combination of one or more codewords.

FIG. 5A illustrates an example of how the act of placing a next bit of a codeword can give rise to a restricted zone in a combination of two codewords.

FIG. 5B illustrates an example of how the act of placing a next bit of a codeword can give rise to a restricted zone in a combination of three codewords.

FIG. 5C illustrates an example of how the act of placing a next bit of a codeword can give rise to a restricted zone when the Hamming weight of each of three codewords is an odd number.

FIG. 6 is a flow chart that illustrates the general concept of how to design certain classes of L=1 Constrained Interleavers, π_(CI−L=1):c→u, such as the CI-3 and CI-4 interleavers.

FIG. 7 is a block diagram of a memory structure, addressing logic and permutation hardware used by certain embodiments of contention free deterministic constrained interleavers.

FIG. 8 is a flow chart that illustrates a method to design contention free deterministic constrained interleavers that also meet the constraints of either CI-3 or CI-4 interleavers or both.

FIG. 9 is a block diagram of a deterministic constrained interleaver that uses a local constraints enforcement permutation to modify a deterministic interleaver's permutation function in order to provide an overall permutation function that corresponds to a deterministic constrained interleaver (DCI).

FIG. 10 is a flow chart that illustrates a design method used to design the local constraint enforcer permutation 910 of FIG. 9.

FIG. 11 is a block diagram of a CTBC code SISO decoder that uses a random or deterministic L=1 constrained interleaver and a 2D memory based interleaver architecture.

FIG. 12 is a block diagram of a memory structure and deterministic contention free interleaver hardware and addressing logic for use in parallel SISO decoders for CTBC codes.

FIG. 13 is a block diagram of an embodiment of a parallel architecture suitable for real time VLSI implementation of a SISO decoder designed for decoding CTBC codes.

FIG. 14 shows an exemplary embodiment of a functional unit used in extrinsic LLR updating when an (8,4) Hamming code is used for the outer block code.

FIG. 15 illustrates a QPSK constellation that uses Reverse Gray Coding.

FIG. 16 illustrates a 16-QAM constellation that uses Reverse Gray Coding.

FIG. 17 illustrates an 8-PSK constellation that uses Reverse Gray Coding.

FIG. 18 illustrates a 16-PSK constellation that uses Reverse Gray Coding.

FIG. 19 is a block diagram of a transmitter, channel, and a receiver that uses constrained interleaved coded modulation (CICM) in accordance with an aspect of the present invention.

FIG. 20 is a flow chart that illustrates a method to design contention free deterministic constrained interleavers for use in the CICM permutation, Γ.

FIG. 21 is a block diagram of a multi antenna embodiment of a CICM-MIMO-SM system that includes a transmitter, a channel and a receiver.

FIG. 22 is a block diagram of a soft iterative decoder that also performs soft interference cancellation.

FIG. 23 is a block diagram of an embodiment of an optical CICM-SM system that includes a transmitter, a channel and a receiver and is designed for use in fiber optic and other types of laser communications systems.

FIG. 24 is a block diagram of an embodiment of an optical CICM-MIMO system that includes a transmitter, a channel and a receiver and is designed for use in fiber optic and other types of laser communications systems.

FIG. 25 is a block diagram of an embodiment of an SM-OFDM and MIMO-OFDM system that makes use of frequency domain spatial channel signature filter banks.

FIG. 26 is a block diagram of an alternative embodiment of an SM-OFDM and MIMO-OFDM system that makes use of spatial channel signature filter banks.

FIG. 27 is a block diagram of an exemplary communication system and method including two transmitters and two receivers that make use of the serial concatenation coding with constrained interleaving in order to communicate between communication endpoint stations.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Throughout this written description various mathematical algorithms will be presented in the form of block diagrams. It is to be understood that in any such cases, the block diagrams can be viewed as hardware blocks or logic blocks that could be carried out in software. Likewise, especially in hardware implementations, a given block in any block diagram herein could be embodied using two or more separate hardware sub-blocks. Hence all such modifications are contemplated as ways to implement various aspects and embodiments of the present invention. Also, it should be recognized that any block diagram whose operation is described herein can be viewed as a flow chart, thereby describing a method in addition to a system or an apparatus.

Constrained Interleaver Mathematical Notation:

A single frame of a CTBC code can be modeled starting from a set of ρ independent message blocks, each of length k, m_(j)=(m_(j1), m_(j2), . . . m_(jk)), j=0, 1, . . . , ρ−1, where ρ is the integer number of message blocks in a frame. These message blocks are first individually encoded by an (n,k) outer block code (OBC) with minimum Hamming distance (MHD) d_(o) to form a sequence of codewords of the OBC. This sequence of codewords of the OBC will be placed into a vector, [c]_(ρn)=[c]_(K), where K=ρn is the frame size. The elements of [c]_(ρn) can be written in terms of the codeword positions, c_(j)=(c_(j1), c_(j2), . . . c_(jn)), for j=0, 1, 2, . . . , ρ−1, and/or in terms of the individual coded bit positions, c(i), for i=0, . . . , K−1, where i=nj+t, for j=0, 1, 2, . . . , ρ−1 and t=0, . . . , n−1. In this document, the term “codeword” specifically refers to a set of coded bits generated by applying the OBC to a message block, {m_(j)}, while the term “codeword position” refers to physical memory locations where the coded bits of a corresponding codeword reside. The vector, [c]_(ρn), can be viewed as a memory array whose contents are the naturally ordered set of codewords, {c_(j)}, or can be viewed as a bit-oriented memory array containing “coded bit positions” where the corresponding coded bits {(c_(j0), c_(j1), . . . c_(jn−1))}, j=0, 1, 2, . . . , ρ−1, physically reside. Also, the term “coded bit position” can refer to a permuted location or address where the corresponding coded bit will reside after an interleaving operation has occurred as described below.
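By way of a non-limiting illustration, the relationship i=nj+t between a codeword position j, an offset t within that codeword, and a coded bit position i can be sketched as follows. The function names and the values ρ=3 and n=8 are assumptions chosen only for this example.

```python
def coded_bit_index(j: int, t: int, n: int) -> int:
    """Map codeword position j and in-codeword offset t to frame index i = n*j + t."""
    return n * j + t


def codeword_position(i: int, n: int) -> tuple[int, int]:
    """Inverse mapping: recover (j, t) from the coded bit position i."""
    return divmod(i, n)


# Hypothetical frame: rho = 3 codewords of an (n, k) OBC with n = 8.
rho, n = 3, 8
K = rho * n  # frame size K = rho * n

# Walking the codewords in natural order visits the coded bit positions 0..K-1.
indices = [coded_bit_index(j, t, n) for j in range(rho) for t in range(n)]
```

In this sketch, coded bit position 21 belongs to codeword position j=2 at offset t=5.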

The contents of the vector, [c]_(ρn), can be permuted to form a constrained interleaved sequence, π:c→u, denoted as u=π[c]. In terms of a physical interleaver structure, the vector u can also be viewed as a vector of coded bit positions, where the coded bit positions (and/or their addresses) are in a permuted order with respect to the coded bit positions in the vector c. The sequence u is then encoded according to an inner recursive convolutional code (IRCC) to form the final coded sequence v=(v₁, v₂, . . . v_(Lρn+ν)) of the CTBC code, where ν is the number of additional terminating bits added by the IRCC. In terms of the generator function G(D) of the IRCC, this conversion from u to v can also be described as v(D)=G(D)u(D), or, in vector notation, v=G[u]=G[π[c]].

In the analysis herein, the IRCC is assumed to be the modulo-2 accumulator, i.e., G(D)=1/(1+D), in which case ν=1. However, in the case of an accumulator this single termination bit can be eliminated, as it contributes the same bit metric resulting from the same coded bit for the two paths terminating at state zero. Using this modulo-2 accumulator, when the Hamming weight of u, W[u], is an even value d, the CTBC coded sequence v consists of d/2 disjoint segments of all ones. Similarly, when W[u] is an odd value d, v consists of ┌d/2┐=(d+1)/2 disjoint segments of all ones, including one segment that ends at the last bit of the sequence v. The interleaver constraints developed herein put restrictions on the permutation π:c→u so as to eliminate selected categories of the low distance error sequences of the final CTBC codeword, v=G[u]=G[π[c]]. That is, constraints are placed on the permutation π:c→u to ensure that the minimum weight of v generated by any non-zero vector c is at least d_(t), where d_(t) is a target MHD of the CTBC code. As discussed later, the constraints developed herein can be applied to any general IRCC with any arbitrary G(D).
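The segment structure described above can be checked numerically. The following is a minimal sketch, assuming an unterminated modulo-2 accumulator (no termination bit) and short hypothetical input vectors; the helper names accumulate and runs_of_ones are introduced only for this illustration.

```python
def accumulate(u):
    """Encode u with the modulo-2 accumulator G(D) = 1/(1+D), no termination bit."""
    v, state = [], 0
    for bit in u:
        state ^= bit  # running XOR: v[i] = u[0] ^ u[1] ^ ... ^ u[i]
        v.append(state)
    return v


def runs_of_ones(v):
    """Return the lengths of the maximal runs of consecutive ones in v."""
    runs, length = [], 0
    for bit in v:
        if bit:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    return runs


# Even weight d = 4: ones of u at positions 1, 4, 6, 11 in a length-12 vector.
u_even = [1 if i in {1, 4, 6, 11} else 0 for i in range(12)]
v_even = accumulate(u_even)
runs_even = runs_of_ones(v_even)  # d/2 = 2 segments, lengths (4-1) and (11-6)

# Odd weight d = 3: ones of u at positions 2, 5, 9.
u_odd = [1 if i in {2, 5, 9} else 0 for i in range(12)]
v_odd = accumulate(u_odd)
runs_odd = runs_of_ones(v_odd)  # (d+1)/2 = 2 segments, the last one runs to the end of v
```

In the even-weight case, each segment length equals the separation between a pair of consecutive non-zero bits of u, which is why spacing constraints on u translate directly into weight guarantees on v.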

CTBC Code Encoder/Transmitter Using a CI with a Single Row:

The present invention introduces a new family of L=1 constrained interleavers (Single Row Constrained Interleavers, SRCI) that are based on a new type of constraint that directly restricts (i.e., constrains out) a particular subset of zero or more indices in the vector u to which a given coded bit of an associated OBC codeword cannot be placed, given the previous placement of coded bits of the current codeword into u and possibly coded bits of other codewords of the OBC that have already been placed into the vector u. As stated above, the prior art CI-2 constrained interleaver required the use of [A]_(L×ρn) that necessarily required L>1 in order to meet a specified target MHD requirement. Hence using the prior art techniques, it would be impossible to set L=1 in order to meet a set of interleaver constraints that would enforce a specified target MHD d_(t)>d₀d_(i) because both intra-codeword bit separations and the inter-row constraints would be needed, thus forcing L>1. Also, using prior art constrained interleaving techniques it would not be possible to design a deterministic constrained interleaver as defined below which has contention free properties and is compatible with, and used along with, a pre-defined deterministic contention free permutation such as the QPP permutation.

FIG. 3 shows a CTBC encoder 301 in accordance with an aspect of the present invention. FIG. 3 will be described assuming the OBC is an (n,k) block code. In general, any other type of finite length code, for example, a tail biting convolutional code or an LDPC code, could be used for the OBC as well. An outer block code (OBC) encoder 306 receives a sequence of independent message blocks, each of length k, m_(j)=(m_(j1), m_(j2), . . . m_(jk)), j=0, 1, . . . , ρ−1. The OBC encoder 306 encodes these independent message blocks to produce an OBC-encoded sequence, c_(j)=(c_(j1), c_(j2), . . . c_(jn)), j=0, 1, . . . , ρ−1, which corresponds to a length-K vector, c. Next, the OBC-encoded sequence, c_(j)=(c_(j1), c_(j2), . . . c_(jn)), j=0, 1, . . . , ρ−1, is passed to a pre-determined L=1 constrained interleaver block 311 (herein also called an SRCI, for single row constrained interleaver) which permutes the vector c according to π_(CI−L=1):c→u. The permutation π_(CI−L=1) can be implemented as an L=1 pseudorandom constrained interleaver (e.g., CI-3 or CI-4 interleavers as discussed below). The vector u is then passed to an IRCC encoder 316 which produces at its output the vector, v, of the CTBC code as previously described. The vector v is then passed to a constellation mapper 321. The constellation mapper 321 can be any form of modulator, for example, the exact type of modulator specified for use in 4G LTE, OTN, or any other type of modulator. In some embodiments, the constellation mapper 321 uses a constrained interleaved coded modulation (CICM) constellation mapper as discussed in more detail below and as shown in FIGS. 15-20.

In a specific example, an L=1 constrained interleaver (also called an SRCI) is used to construct a transmitter to generate and transmit CTBC codes. In such a transmitter, an outer encoder is configured to transform a sequence of input bits to a sequence of outer encoded bits. The sequence of outer-encoded bits is encoded in accordance with an outer code that can be a block code (which would include an LDPC code) or a non-recursive convolutional code, for example. A constrained interleaver would be configured to implement a permutation function to permute the order of the outer-encoded bits to produce a constrained-interleaved sequence of outer-encoded bits. The constrained interleaver implements at least one SRCI (single row constrained interleaver) constraint that prevents one or more low-distance error sequences from occurring. The permutation function also implements a pseudo-random reordering of the outer-encoded bits subject to the at least one SRCI constraint. An inner encoder is configured to encode the constrained-interleaved sequence of outer-encoded bits into a sequence of inner-encoded bits. A constellation mapper is used to map the sequence of inner-encoded bits to a transmission signal such as a BPSK signal, a QPSK signal, a 16-QAM signal, or a 16-PSK signal, for example. In this example, the sequence of inner-encoded bits constitutes a serially-concatenated sequence of bits that incorporates coding from both the inner code and the outer code in accordance with a serially-concatenated code that achieves a target minimum distance of d_(t). The outer code has a minimum distance of d_(o) and the inner code has a minimum distance of d_(i). In this example, the permutation function implemented by the SRCI constrained interleaver is configured to implement the SRCI constraint in order to enforce d_(t)>d₀d_(i). The SRCI constraint ensures that the permutation function does not place any respective index from the integer ring [0,K−1] into any position in a permuted integer ring π[0,K−1] that corresponds to any identified respective restricted zone. Each identified respective restricted zone corresponds to a subset of one or more adjacent positions in π[0,K−1] such that, if the respective index were to be placed into any one of those positions, at least one error sequence of weight less than d_(t) would become possible in the serially-concatenated code.

Observations Regarding the CI-2 Interleaver:

The design of the specific classes of permutation functions implemented by the L=1 constrained interleavers of FIG. 3 is based upon some observations regarding the bit error rate performance of CTBC codes constructed using the CI-2 interleaver. These observations will be used below to develop the CI-3 and CI-4 interleavers. The CI-3 and CI-4 interleavers developed below are used to construct CTBC codes with improved BER performance as compared to CTBC codes constructed using a CI-2 interleaver.

1. The weight of a CTBC codeword, W[v], is a sum of the distances between non-zero coded bits in u. Starting the count with zero, a string of ones in v begins at the position of each even numbered non-zero coded bit in u and ends at the position immediately before each odd numbered non-zero coded bit in u.

2. The effect of the parameter L on certain key error coefficients of CTBC codes constructed using a CI-2 can be seen directly in equations (2) and (6) of the Fonseka reference [1]. These error coefficients are minimized with respect to L when L=1. When the CI-2 interleaver is used to construct a CTBC code, increasing L leads to higher values of MHD, but also lower values of the interleaver gain. This is because when the CI-2 design matrix, [A]_(L×ρn), is read in column-major order to create the sequence u, any two coded bits of any given codeword of the OBC will have a separation of at least L bits in u. However, note that the frame size is K=Lρn, and for fixed values of K and n, the value of ρ is maximized when L=1. Therefore, decreasing L increases the number of codewords of the OBC, ρ, that can be placed on any single row. The interleaver gain, which increases with the number of possible permutations of coded bits in u, thus increases as L is lowered and is maximized when L=1.

3. The inter-row constraints in CI-2 were introduced to ensure that two non-zero codewords of the OBC placed on two different rows of [A]_(L×ρn) will generate a CTBC codeword, v, that has a weight, W[v], that is greater than or equal to the target MHD. With the CI-2 inter-row constraints, when coded bits of a codeword c₁ on row i and a codeword c₂ on row (i−l) are observed in pairs (with one coded bit from c₁ and the other from c₂) in the sequence u, the inter-row constraints ensure that only up to κ(l) such pairs are allowed to have a separation of l in u, for a set of considered/constrained row separations l=1, . . . , l_(max). Further, the inter-row constraints and the reading of [A]_(L×ρn) in column-major order ensure that all remaining pairs have at least a separation of (L−l) in u, up to a maximum of l_(max). For example, consider the placement of coded bits of a codeword c₂ in u when κ(l)=1, for l=1, 2, . . . , l_(max). Then if codeword c₁ has a coded bit with an l≦l_(max) bit separation from a coded bit of c₂ in u, the inter-row constraints ensure that the separation between every other coded bit of c₁ and every other coded bit of c₂ has to be at least (L−l).

4. Additionally, the act of reading the CI-2 design matrix, [A]_(L×ρn), in column-major order introduces an inherent constraint. This inherent constraint deals with codewords separated by l_(max)+1 rows. In order to understand the inherent constraint, consider a typical example as provided in the Fonseka reference [1] where L is selected as L=2(l_(max)+1) and (l_(max)+1)=d_(t)/d₀ in order to achieve a target MHD of d_(t)=d₀². In particular, consider the specific case where d₀=4, d_(t)=d₀²=16, l_(max)=3, and L=8.

Given the above example, consider the case where three codewords c₁, c₂ and c₃ of the OBC have been placed into consecutive rows of [A]_(L×ρn). When [A]_(L×ρn) is read in column major order, if the separation between a coded bit of c₁ and a coded bit of c₂ on u is one, and the separation between a coded bit of c₂ and c₃ is also one, then the row-column structure of [A]_(L×ρn) ensures that the separation between any coded bit of c₁ and a coded bit of c₃ has to be at least 2. Similarly, when L=8 and l_(max)+1=4, if c₁, c₂, c₃ and c₄ are codewords of the OBC that are placed on consecutive rows of [A]_(L×ρn), then the minimum possible separation between each {c_(i), c_((i+1))}, i=1, . . . , 3 is one, and the minimum possible separation between c₁ and c₃ and between c₂ and c₄ is at least two.

If it happens to be that the actual minimum separation between coded bit pairs in codewords c₁ and c₃, and the actual minimum separation between coded bit pairs in c₂ and c₄ are both 2, then the minimum separation between coded bit pairs in c₁ and c₄ will have to be at least 3. Due to the row-column structure of a CI-2, when κ(l)=1 for l=1, 2, . . . , l_(max), this inherent constraint prevents the generation of coded sequences v with weight less than d_(t) from three through l_(max) codewords of the OBC. This inherent constraint ensures the minimum weight of coded sequences v generated by three through l_(max) codewords of the OBC is dependent on L since all remaining n−1 pairs of coded bits have at least a separation of (L−l) in u. However, the minimum weight of sequences of v generated by (l_(max)+1) codewords of the OBC is independent of L. The act of (implicitly) reading the row-column matrix structure of [A]_(L×ρn) in column major order adds in this inherent constraint that is not explicitly called out as a separate constraint in the Fonseka reference [1], or in U.S. Pat. Nos. 8,537,919 and 8,532,209.

5. CI-2 is structured to maintain the same MHD for all codewords of the concatenation regardless of whether they are generated by one non-zero codeword of the OBC or whether they are generated by combinations of two or more non-zero codewords of the OBC. Observe that different categories (subsets) of CTBC codewords, {v}, can be defined in terms of the number of codewords of the OBC that combine to form a potentially low weight CTBC codeword, v, at a given distance, d. Further, observe that the interleaver gain of CTBC codewords in each different category of codewords has a corresponding different category-level error coefficient.

To understand this further, note that the asymptotic bit error rate (BER) of any CTBC code is determined by the error contributions of the codewords according to

$\begin{matrix}{P_{e} \approx \sum\limits_{d} A_{d}\, R\left( d,\gamma_{b} \right)} & (4)\end{matrix}$

where A_(d) is the error coefficient of the corresponding weight d codewords, and R(d,γ_(b)) is the probability of decoding in favor of a CTBC codeword with a weight d error sequence at a bit signal to noise ratio of γ_(b)=E_(b)/N₀. As per Lemma 1 of the Fonseka reference [1], CTBC codes can be designed with a CI-2 to eliminate the error contributions in equation (4) associated with all CTBC codewords having weights d which are below a selected target MHD, d_(t). At the same time, the A_(d) values of the remaining terms in equation (4) can be reduced to be close to the A_(d) values associated with uniform interleaving. This simultaneous elimination of the lower weight error terms in equation (4) and the reduction of the remaining A_(d) values allows powerful CTBC codes to be constructed starting from simple component codes. However, it is further observed here that the individual A_(d) values at each given distance, d, can also be sub-divided down to a finer granularity by considering different categories of codewords that have the same distance, d, but different error coefficients.

6. CI-2 is structured to maintain the same MHD for all codewords of the concatenation regardless of their category, i.e., whether they are generated by one or two or more non-zero codewords of the OBC. Observe that the final bit error probability of (1) can be lowered by using different values of d_(t) for different categories of codewords. For example, if the error coefficient for a certain category of codewords is much less than the error coefficient for another category of codewords, the error probability of equation (4) can be lowered by using a higher d_(t) value for the category of codewords with the much lower error coefficient. This is because the number of possible error sequences in the category of codewords with the much lower error coefficient is much smaller.

7. The standard CI-2 construction treats any combination of d₀ non-zero coded bits of a codeword of the (n,k) OBC as a codeword whether or not that combination is actually a codeword. Observe that the actual number of codewords of the OBC with weight d₀ is usually lower than the total possible number of permutations of the d₀ non-zero coded bit positions of each codeword position, c_(j). Hence, many combinations of one or more codewords, each containing d₀ non-zero coded bits, will not actually correspond to valid combinations of one or more codewords of the OBC. Such invalid combinations should be ignored to increase interleaver gain whenever possible.
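Observation 7 can be made concrete with a small enumeration. The sketch below assumes, purely for illustration, that the OBC is the (8,4) extended Hamming code in one common systematic form; the generator matrix G and the helper encode are assumptions of this example and are not taken from the figures. It compares the number of minimum weight codewords with the number of ways to choose d₀ non-zero positions out of n.

```python
from itertools import product
from math import comb

# Assumed systematic generator matrix [I | P] of the (8,4) extended Hamming code.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]


def encode(msg):
    """Multiply the message row vector by G over GF(2)."""
    return [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]


codewords = [encode(m) for m in product([0, 1], repeat=4)]
weights = [sum(c) for c in codewords]

d0 = min(w for w in weights if w > 0)   # minimum Hamming distance of the OBC
num_min_weight = weights.count(d0)      # actual codewords of weight d0
num_patterns = comb(8, d0)              # all placements of d0 ones in n = 8 positions
```

For this assumed code, only 14 of the 70 possible weight-4 bit patterns are valid codewords, so a design that treats every pattern as a codeword constrains the interleaver more than is strictly necessary.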

CI-3 Interleaver Constraints to Meet a Target MHD:

A CI-3 interleaver is defined in accordance with a set of Constraints 1-4 as defined in this section. Constraints 1-4 provide similar restrictions as the CI-2 constraints, but do so in a manner so as to avoid the use of the CI-2 design matrix, [A]_(L×ρn). This allows the use of L=1 and thus allows CTBC codes to be constructed with higher interleaver gains than can be achieved with a CI-2 interleaver. Just like the CI-2, Constraints 1-4 can be used to achieve a target MHD, d_(t), as high as d_(t)=d₀². In the next section, an additional constraint, Constraint 5, is defined that also allows even higher target MHDs to be reached than is possible with CI-2 interleavers.

To start, a parameter s₁ is defined to identify a spacing requirement between coded bits of a single non-zero codeword of the OBC. All of Constraints 1-5 put constraints directly in the sequence u without the use of [A]_(L×ρn). The parameter s₁ performs a similar function as L in the CI-2 interleaver, but defines a separation requirement directly applied to the coded bit positions of codewords as opposed to defining the number of rows of the CI-2 design matrix.

Constraint 1:

Constraint 1 is used to prevent low distance error events/sequences from occurring among a first category of CTBC codewords, denoted Φ₁, generated as v=G[u]=G[π[c]]∈Φ₁, where c consists of a single non-zero codeword of the OBC having the minimum weight d_(o). Constraint 1 ensures that any two coded bits of every codeword of the OBC must have at least a separation of s₁ positions between them on u, where s₁ is chosen to satisfy:

$\begin{matrix}{s_{1} \geq \begin{cases}\dfrac{2d_{t}}{d_{o}}, & d_{o}\ \text{is even} \\ \dfrac{2d_{t}}{d_{o} - 1}, & d_{o}\ \text{is odd}\end{cases}} & (5)\end{matrix}$

With this constraint, the resulting sequence v∈Φ₁ will contain [d₀/2] segments of all ones, and each such segment will have at least weight s₁. Therefore, W[v]≧┌s₁*d₀/2┐≧d_(t).

Constraint 1 can be easily handled using the pre-selected value of s₁. For example, when finding a position for a coded bit (nj+t) of a codeword at any codeword position, c_(j), all positions within s₁ locations in u of any already positioned coded bits, π(nj+t) for t∈{0, . . . n−1}, of that codeword correspond to restricted locations in the vector u where the bit (nj+t) cannot be placed. The term “restricted zone” is used herein to denote the set of restricted locations in the vector u where the bit (nj+t) cannot be placed. The interleaver gain associated with codewords in category 1 is the lowest relative to all the other categories of codewords discussed in this section. This is because there are more ways to generate codewords in category 1 than any other category identified herein.
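The handling of Constraint 1 can be sketched in a few lines. The following illustration assumes that a separation of at least s₁ means that two positions p and q of the same codeword satisfy |p−q|≧s₁, so that the restricted zone around an already placed bit consists of the positions with |p−q|<s₁. The function names and the example values are hypothetical.

```python
def min_spacing_s1(d_t: int, d_o: int) -> int:
    """Smallest integer s1 satisfying equation (5) for target MHD d_t and OBC distance d_o."""
    divisor = d_o if d_o % 2 == 0 else d_o - 1
    return -(-2 * d_t // divisor)  # integer ceiling of 2*d_t / divisor


def restricted_zone(placed, s1: int, K: int) -> set:
    """Positions of u closer than s1 to any already placed bit of the same codeword."""
    zone = set()
    for p in placed:
        zone.update(range(max(0, p - s1 + 1), min(K, p + s1)))
    return zone


# Example: d_o = 4 and d_t = 16 give s1 = 8, as in the CI-2 example where L = 8.
s1 = min_spacing_s1(16, 4)

# One bit of the codeword already sits at position 10 of a hypothetical K = 64 frame.
zone = restricted_zone([10], s1, 64)
```

A pseudorandom placement procedure would then draw the next position for that codeword only from positions outside the returned set.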

Constraint 2:

Constraint 2 is used to prevent low distance error events/sequences from occurring among a second category of CTBC codewords, denoted Φ₂, generated as v=G[u]=G[π[c]]∈Φ₂, where c consists of two non-zero codewords of the OBC, each having the minimum weight d_(o). Constraint 2 ensures that, if a coded bit of any codeword position c_(j) and a coded bit of any other codeword position c_(j1) have a spacing of exactly (l_(max)+1), then all other coded bits of c_(j) must have a separation of at least (l_(max)+1) positions from every other coded bit of each codeword position c_(j1) on u. The parameter, l_(max), is chosen to satisfy (l_(max)+1)=┌d_(t)/d₀┐. With this constraint, the resulting sequence v∈Φ₂ will have d_(o) segments of ones, each with weight of at least (l_(max)+1)=┌d_(t)/d₀┐. This ensures that W[v]≧(l_(max)+1)d_(o)≧d_(t).

Constraint 2 can be handled by storing a list of pairs of codeword positions (c_(j), c_(j1)) that have a coded bit of c_(j) and a coded bit of c_(j1) separated by exactly (l_(max)+1) positions. When finding a position for a coded bit of c_(j), if c_(j) happens to be on that list coupled with c_(j1), all positions within (l_(max)+1) of the remaining coded bit positions of c_(j1) need to be added to the restricted zone. Note that if all of the separations are larger than (l_(max)+1), then there is no constraint.

The impact of the inter-row constraints related to rows i and (i−l), which are defined using κ(l) in traditional CI-2, is to ensure that (a) coded bits of a codeword c₁ can pair up with only up to κ(l) pairs with coded bits of a codeword c₂ on u, where a pair is formed by a coded bit of c₁ and a coded bit of c₂ at a separation of l, and (b) all remaining pairs of coded bits of c₁ and c₂ maintain at least a separation of (L−l), for l=1, 2, . . . l_(max), where l_(max) is found according to d₀(l_(max)+1)≧d_(t). In SRCI constraint 2, the same l_(max) value as in traditional CI-2 is used. Preserving the same impact, inter-row constraints of SRCI can be enforced as: (a) no more than κ(l) pairs of coded bits from any two codewords c₁ and c₂ are allowed to have a separation of l or less on u, and (b) if l_(a) (≦κ(l)) pairs have separations l₁, l₂, . . . , l_(l_a) (where each l_(x)≦l, x=1, 2, . . . , l_(a)), then all remaining pairs need to have a separation of more than

$\left\lceil \left( d_{t} - \sum\limits_{p = 1}^{l_{a}} l_{p} \right) / \left( d_{0} - l_{a} \right) \right\rceil,$

for all l=1, 2, . . . l_(max).
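The bound on the remaining pair separations can be evaluated directly from the expression above. The following sketch assumes integer parameters with l_(a)<d₀; the function name and the example values (d₀=4, d_(t)=16) are chosen only for illustration.

```python
def remaining_pair_bound(d_t: int, d_0: int, close_separations) -> int:
    """Evaluate ceil((d_t - sum(l_p)) / (d_0 - l_a)) for the l_a closely spaced pairs."""
    l_a = len(close_separations)
    if l_a >= d_0:
        raise ValueError("the number of close pairs l_a must be smaller than d_0")
    numerator = d_t - sum(close_separations)
    return -(-numerator // (d_0 - l_a))  # integer ceiling


# With d_0 = 4 and d_t = 16:
bound_one_pair = remaining_pair_bound(16, 4, [1])      # one pair at separation 1
bound_two_pairs = remaining_pair_bound(16, 4, [1, 2])  # pairs at separations 1 and 2
```

As stated above, every remaining pair must then have a separation larger than the returned value, so closely spaced pairs raise the spacing demanded of the other pairs.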

Constraint 3:

Constraint 3 is used to prevent low distance error events/sequences from occurring among a third category of CTBC codewords, denoted Φ₃, generated as v=G[u]=G[π[c]]∈Φ₃ where, similar to Constraint 2, c consists of two non-zero codewords of the OBC, each having the minimum weight d_(o). Constraint 3 ensures that, if the two nearest coded bits of a codeword position c_(j) and a codeword position c_(j1) have a separation of l<(l_(max)+1), then only up to a total of κ(l) (<d₀) such pairs of coded bits of c_(j) and c_(j1) may have a separation of less than (l_(max)+1), and all the rest of the (n−κ(l)) coded bits of c_(j) must have a separation of at least s₂(l)=(s₁−l) positions from every other coded bit of c_(j1). In the selection of κ(l) values, l=1, 2, . . . , l_(max), note that the lowest weight category 3 CTBC codewords will consist of (a) κ(l) segments, each with weight between l and l_(max), and (b) (d_(o)−κ(l)) additional segments, each with weight of at least s₂(l)=(s₁−l). Hence, to assure W[v]≧d_(t), κ(l) is selected to ensure that lκ(l)+(d₀−κ(l))s₂(l)≧d_(t). Equivalently, κ(l) is selected as

$\begin{matrix}{\kappa(l) < \dfrac{d_{0}\left( s_{1} - l \right) - d_{t}}{s_{1} - 2l},\quad l = 1,2,\ldots,l_{\max}.} & (6)\end{matrix}$

In the case κ(l)=1 for l=1, 2, . . . , l_(max), when a coded bit of a codeword position c_(j) is l (≦l_(max)) positions away from a coded bit of c_(j1), every remaining coded bit of c_(j) has to be positioned at least (s₁−l) positions away from every other coded bit of c_(j1). The case κ(l)=1 for l=1, 2, . . . , l_(max), with the introduction of Constraint 4 below, can generate powerful concatenations with MHD values of d_(t)=d₀² as discussed in the Fonseka reference [1].

Constraint 3 can be implemented by checking to see that if a coded bit of codeword position c_(j) and a coded bit of codeword position c_(j1) have a separation of l, then when finding a position for a coded bit of c_(j), all positions within (s₁−l) away from already placed other coded bits of c_(j1) should be designated as restricted zones.

Constraint 4:

Constraint 4 is used to prevent low distance error events/sequences from occurring among a fourth category of CTBC codewords, denoted Φ₄, generated as v=G[u]=G[π[c]]∈Φ₄, where c consists of 3, . . . , (l_(max)+1) non-zero codewords of the OBC. Constraint 4 ensures that, if a set of codewords c_(j_h), h=1, 2, . . . , p, for each p≤(l_(max)+1), are placed on u in such a way that the minimum separation between two coded bits of c_(j_h) and c_(j_(h+1)) is one for h=1, 2, . . . , (p−1), then the minimum separation between every coded bit of c_(j_h) and every coded bit of c_(j_(h+2)) has to be at least 2 for h=1, 2, . . . , (p−2). In addition, once the coded bits are randomly placed, if they are placed in such a way that the actual minimum separation of coded bits of codewords c_(j_x) and c_(j_(x+y)) is y for y=2, 3, . . . , s, x=1, 2, . . . , (l_(max)+1−y), s+x<(l_(max)+1), then the minimum spacing between every coded bit of c_(j_x) and c_(j_(x+y+1)) has to be at least (y+1).

Constraint 4 is designed to implement the inherent constraint discussed above in connection with CI-2, but can be used when L=1, i.e., there is no CI-2 design matrix, [A]_(L×ρn), that is read in column-major order. With Constraint 4 as stated above, if the minimum separation of coded bits of codewords 1 and 2, codewords 2 and 3, and codewords 3 and 4 are all one, and the minimum separation of coded bits of codewords 1 and 3, and codewords 2 and 4 (each of which should be at least 2) happens to be actually 2, then the minimum separation of coded bits of codewords 1 and 4 has to be at least 3. Constraint 4 thus makes the L=1 implementation function like a standard CI-2 where the reading of [A]_(L×ρn) in column-major order automatically/inherently adds in the above-mentioned inherent constraint.

Constraint 4 can be efficiently implemented by monitoring neighboring codewords of every coded bit on u. Let us identify the n-bit codewords by their identification numbers, c_(j), for j=0, . . . , ρ−1. For each codeword c_(j), for j=0, 1, . . . , ρ−1, and for each s=1, 2, . . . , (l_(max)+1), prepare a respective list of neighboring codewords, Ln_(j)(s), whose list entries identify all of the neighboring codewords of c_(j) in u that have a coded bit at a minimum separation of s relative to any of the n coded bits of the codeword c_(j). Note that each of these lists is an array with at most 2n entries. Once the sequence u starts to fill up, these lists of neighbors begin to fill up to their maximum value of at most 2n entries. When selecting a position for a coded bit position (nj+t) of codeword position c_(j), the lists Ln_(j)(s) are consulted. Suppose that c_(jx) is an entry of Ln_(j)(1), and c_(jy) is an entry of Ln_(jx)(1). Then mark as a restricted zone one position around each coded bit of codeword c_(jy) when placing coded bit position (nj+t).

When κ(l)=1 for l=1, 2, . . . , l_(max), Constraint 4 prevents the generation of coded sequences v with weight less than d_(t) from three through (l_(max)+1) number of codewords of the OBC. Together, Constraints 1-4 ensure the minimum weight of coded sequences v generated by three through l_(max) codewords of the OBC is dependent on s₁. However, the minimum weight of sequences of v generated by (l_(max)+1) or more codewords of the OBC is independent of s₁. The minimum weight generated by a combination of (l_(max)+1) codewords of the OBC can be found by considering the worst case placement of coded bits of (l_(max)+1) codewords when placed in accordance with Constraints 1-4. As discussed in Lemma 1 of the Fonseka reference [1], the minimum weight of sequences of v generated by (l_(max)+1) or more codewords of the OBC limits the MHD that is achievable by a CTBC code constructed with a CI-2 to be d_(t)=d₀². Constraints 1-4 can be used to generate concatenations with MHD d_(t)≤d₀² but now with a higher interleaver gain since L=1.

Furthermore, note that when L=1 and Constraints 1-4 are applied as described above, the worst case placement of (l_(max)+1) weight d₀ codewords of the OBC creates the following sequences of ones in v: one sequence with weight d₀, two sequences with weight (d₀−1) and so on up to d₀ sequences of ones with weight 1. Therefore, the worst case weight generated by (l_(max)+1) codewords is,

$\begin{matrix}{{\sum\limits_{i = 1}^{d_{0}}\; \left\lbrack {i\left( {d_{0} - i + 1} \right)} \right\rbrack} = {{d_{0}\left( {d_{0} + 1} \right)}{\left( {d_{0} + 2} \right)/6.}}} & (7)\end{matrix}$

Note that the resulting MHD is thus d_(t)=d₀(d₀+1)(d₀+2)/6, and this is greater than d₀² for d₀>2. However, to reach this target MHD greater than d₀², it is necessary to also satisfy Constraint 5 as provided in the next section to prevent the generation of any additional low weight error sequences that can give rise to a coded sequence v with weight less than d_(t)=d₀(d₀+1)(d₀+2)/6 that can arise from combinations of 2d₀, (2d₀+1), . . . , ⌊(2d_(t)−1)/d₀⌋ codewords of the OBC.
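The closed form in (7) and its comparison with d₀² can be checked numerically. The short sketch below is illustrative only; the function name is an assumption.

```python
def worst_case_weight(d_0):
    """Left-hand side of (7): i sequences of weight (d_0 - i + 1), i = 1..d_0."""
    return sum(i * (d_0 - i + 1) for i in range(1, d_0 + 1))

for d_0 in range(1, 9):
    closed_form = d_0 * (d_0 + 1) * (d_0 + 2) // 6
    assert worst_case_weight(d_0) == closed_form
    # Relation to d_0^2: equal at d_0 = 2, strictly larger for d_0 > 2.
    print(d_0, closed_form, d_0 ** 2)
```

For the (8,4) example with d₀=4, this gives 20, compared with d₀²=16.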

That is, if only Constraints 1-4 are applied, the resulting CI-3 interleaver can be designed to achieve the MHD that is achievable by a CTBC code constructed with a CI-2, i.e., d_(t)=d₀². Constraint 5 can additionally be enforced in order to reach d_(t)=d₀(d₀+1)(d₀+2)/6>d₀². This causes the relatively few additional possible error sequences due to combinations of up to ⌊(2d_(t)−1)/d₀⌋ codewords to be eliminated by preventing the generation of sequences with weight between d₀² and d_(t)=d₀(d₀+1)(d₀+2)/6.

A CI-3 interleaver can also make use of Constraint 5 to provide still higher MHDs, i.e., with d_(t)>d₀(d₀+1)(d₀+2)/6. To do this, Constraints 1-4 are applied as outlined above, and Constraint 5 is applied using the CI-4 design approach of the next section, using a selected target MHD, d_(t)>d₀(d₀+1)(d₀+2)/6. This mixed CI-3/CI-4 interleaver can reduce the size of the restricted zones as compared to the straight CI-4 interleaver design approach described below that can reach the same target MHD. This can provide increased interleaver gain and make it easier to find CI-3 interleavers as compared to a CI-4 interleaver designed at a selected target MHD. Hence it is to be understood that CI-3 interleavers with d_(t)>d₀² and even d_(t)>d₀(d₀+1)(d₀+2)/6 can be constructed by additionally applying the CI-4 design approach of the next section, but also enforcing any restricted zones from Constraints 1-4 at the same time. Once Constraints 1-4 are enforced, the restricted zones that arise due to Constraint 5 will be greatly reduced, thereby shifting a large load of restricted zones from Constraint 5 to Constraints 1-4. Since Constraints 1-4 are more restrictive than Constraint 5, when applied along with Constraint 5, they will tend to make it easier to find CI-4 solutions and to potentially increase interleaver gain due to a smaller number of restricted zones when compared to applying CI-4 on its own.

As mentioned in observation 6 of the CI-2 as discussed above, it is possible to use different values for the target MHD, d_(t), for different categories of codewords. The categories of codewords whose category-level error coefficient is lowest, for example, can use a lower MHD value to cause the overall probability of error in (1) to be lowered. This use of multiple target MHDs is applied similarly in both of the CI-3 and CI-4 interleaver designs, so the multiple target MHD versions of the CI-3 and CI-4 interleavers will be described after the CI-4 interleaver is developed.

CI-4 Interleaver Design Approach to Meet a Target MHD:

In the CI-4 interleaver design approach, the coded bits of codewords of the OBC are pseudo-randomly placed directly into the sequence u in such a way as to maintain a target MHD, d_(t), of a concatenation of non-zero codewords. The CI-4 interleaver is designed to be as close to a uniform interleaver as possible while simultaneously maintaining the MHD at d_(t). In the raw CI-4 approach, there is only one constraint, namely Constraint 5. Interleavers that only use Constraint 5 are called CI-4 interleavers. Interleavers that use one or more of Constraints 1-4 and additionally enforce Constraint 5 are called mixed CI-3/CI-4 interleavers.

Constraint 5:

Constraint 5 requires that the coded bits of combinations of an integer number, N_(c), of nonzero codewords of the OBC are positioned in u such that W[v]≥d_(t), for some specified target MHD, d_(t), where v=G[u]=G[π[c]] is the CTBC codeword generated from the combination of codewords of the OBC in c.

In CI-4 interleavers, different categories of CTBC codewords that correspond to different types of error sequences are denoted Φ_(m)^((CI-4)), m=1, . . . , N_(c). The category Φ_(m)^((CI-4)) includes all weight d CTBC codewords formed by a combination of m≤N_(c) non-zero codewords of the OBC with the minimum weight d₀. In mixed CI-3/CI-4 interleavers, all of the categories of codewords discussed in connection with the CI-3 interleaver can exist, and additionally, the categories Φ_(m)^((CI-4)), m=1, . . . , N_(c), are defined. If there is any overlap between the CI-3 categories and the CI-4 categories, the CI-3 categories take precedence and any remaining combinations of codewords that are not already accounted for by a CI-3 category are accounted for in the CI-4 categories. With this definition, no single CTBC codeword can fall into more than one category.

To understand how Constraint 5 can be implemented, consider the example where the OBC is an (8,4) extended Hamming code and the IRCC is an accumulator. Table 1 enumerates the 16 different codewords of the (8,4) extended Hamming code. Note that 14 of these codewords have weight d₀=4, one has weight 8, and one has weight zero. Next recall that the vector c contains codeword positions c_(j)=(c_(j1), c_(j2), . . . , c_(jn)), for j=0, 1, . . . , ρ−1. To start, consider the case where Nc=1, so that only one non-zero codeword need be considered. Let a₁<a₂<a₃<a₄ be the ordered set of indices where the four ones of a corresponding weight d₀=4 codeword are placed into u by the permutation π. Then v=G[u] will have all zeros, except for a string of ones starting at a₁ and terminating at a₂, and another string of ones starting at a₃ and terminating at a₄. Hence, Constraint 5 will require that all 14 of the weight d₀=4 codewords in Table 1 satisfy W[v]=(a₂−a₁)+(a₄−a₃)≥d_(t). Constraint 5 will also require that W[v]=(a₂−a₁)+(a₄−a₃)+(a₆−a₅)+(a₈−a₇)≥d_(t) is satisfied by codeword #16 in Table 1, whose eight ones are placed on the ordered set of indices in u, a₁<a₂<a₃<a₄<a₅<a₆<a₇<a₈.

TABLE 1
(8, 4) Hamming Codewords

 1) [0 0 0 0 0 0 0 0]     9) [0 1 1 0 1 0 1 0]
 2) [1 0 0 0 1 0 1 1]    10) [0 1 0 1 0 0 1 1]
 3) [0 1 0 0 1 1 0 1]    11) [0 0 1 1 1 0 0 1]
 4) [0 0 1 0 0 1 1 1]    12) [1 1 1 0 0 0 0 1]
 5) [0 0 0 1 1 1 1 0]    13) [1 1 0 1 1 0 0 0]
 6) [1 1 0 0 0 1 1 0]    14) [1 0 1 1 0 0 1 0]
 7) [1 0 1 0 1 1 0 0]    15) [0 1 1 1 0 1 0 0]
 8) [1 0 0 1 0 1 0 1]    16) [1 1 1 1 1 1 1 1]

The above expressions for W[v] form sums based upon “pairs” of indices of where the ones of a corresponding codeword of the OBC are located in the vector u. The locations of the ones that make up the pairs are identified starting from left to right in u. These pairs are important, because they each give rise to a respective string of ones in the vector v. Each string of ones in v begins at the location of the first one in each pair and ends at the location right before the second one in each pair of ones in u. In the above expressions for W[v], the weight d₀=4 codeword has pairs given by (a₁,a₂) and (a₃,a₄), and the weight 8 codeword has pairs given by (a₁,a₂), (a₃,a₄), (a₅,a₆), and (a₇,a₈). A “doublet” is defined as a pair that generates weight one in v, e.g., the ordered indices a₁ and a₂ form a doublet if (a₂−a₁)=1. The condition that Constraint 5 avoids, i.e., W[v]<d_(t), can generally occur due to formation of low weight pairs, and in the worst case, doublets.
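The pair-based weight expression can be cross-checked against a direct accumulation. The following is a minimal sketch that assumes the accumulator IRCC computes v[k] = u[0] ⊕ . . . ⊕ u[k]; the function names and the sample positions are illustrative assumptions.

```python
from itertools import accumulate
from operator import xor

def weight_from_pairs(ones):
    """Sum of (second - first) over consecutive pairs of one-positions in u.

    Assumes an even number of ones, so that every one has a partner.
    """
    a = sorted(ones)
    if len(a) % 2:
        raise ValueError("an even number of ones is assumed here")
    return sum(a[i + 1] - a[i] for i in range(0, len(a), 2))

def weight_by_accumulation(ones, length):
    """Weight of v = G[u] computed by actually running the accumulator."""
    ones = set(ones)
    u = [1 if i in ones else 0 for i in range(length)]
    v = list(accumulate(u, xor))   # v[k] = u[0] ^ ... ^ u[k]
    return sum(v)

ones = [2, 5, 9, 10]               # pairs (2,5) and (9,10); the second is a doublet
print(weight_from_pairs(ones), weight_by_accumulation(ones, 16))  # 4 4
```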

Note that when bit (nj+t) of the vector c is placed into a location π(nj+t) on u to maintain W[v]≥d_(t) for all combinations of i=1, 2, . . . ,

${N_{c} = \left\lfloor \frac{2d_{t}}{d_{0}} \right\rfloor},$

(i.e., 2d_(t)/d₀, rounded down to the nearest integer when this ratio is not an integer) number of the codewords, then additionally, all combinations of a total of Nx>Nc number of codewords of the OBC with c_(j) will also satisfy W[v]≥d_(t). This is the case because when there are a total of

$N_{c} = \left\lfloor \frac{2d_{t}}{d_{0}} \right\rfloor$

codewords, each with the minimum weight d₀, this combination of codewords generates at least a total of d₀Nc=d₀⌊2d_(t)/d₀⌋=2d_(t) ones in u. In the worst case, each of these ones will pair up to form 2d_(t)/2=d_(t) doublets in u, each generating a weight of 1 in v, and therefore W[v]≥d_(t). Increasing Nx beyond Nc can only increase W[v] beyond this worst case value. Also, if higher weight codewords are involved, this also will only increase W[v] beyond the worst case value.

The paragraph above shows that when Constraint 5 is implemented by checking all combinations of only up to Nc=⌊2d_(t)/d₀⌋ number of non-zero codewords of the OBC, W[v]≥d_(t) will be satisfied for all possible combinations of nonzero codewords in c. Additionally, note that if the target MHD of the CTBC codeword, v, is d_(t)=20 and the MHD of the OBC is d₀=4, then Nc=⌊40/4⌋=10, while if d_(t)=16, then Nc=⌊32/4⌋=8. That is, the higher the target minimum distance, d_(t), of the CTBC codeword v relative to the MHD of the OBC, d₀, the more combinations of codewords, Nc, need to be considered to maintain W[v]≥d_(t).
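The two numerical examples above can be reproduced with integer floor division. The function name below is an illustrative assumption.

```python
def num_codewords_to_check(d_t, d_0):
    """N_c = floor(2 * d_t / d_0), the largest combination size that needs checking."""
    return (2 * d_t) // d_0

print(num_codewords_to_check(20, 4))  # 10
print(num_codewords_to_check(16, 4))  # 8
```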

The permutation π can be built up sequentially. For example, each bit for t=0, 1, . . . , 7 of any coded bit position within the codeword position, c_(j), in c, can be “placed” one bit at a time. The term “placed” refers to the action of identifying that a bit location nj+t in c will be permuted to a location π(nj+t) in u. A codeword such as codeword #13 in Table 1 is said to have been “completed” once enough coded bits of c_(j) have been placed into u to allow all of the ones of the completed codeword to appear on u. For example, coded bit positions 0, . . . , 3 of the codeword position, c_(j), can be mapped to any set of permuted locations, π(nj+0), . . . , π(nj+3), without the possibility of having any weight d₀=4 codeword of the OBC of Table 1 complete. This is because [1 1 1 1 0 0 0 0], or any other non-zero 8-bit sequence whose ones are confined to the first four positions, is not a codeword of the (8,4) OBC, as can be seen from Table 1. However, when the fifth bit of the codeword position, c_(j), is mapped to π(nj+4), then all of the ones needed to complete codeword #13, i.e., [1 1 0 1 1 0 0 0], will have been mapped from codeword position, c_(j), to u. In general, a maximum of μ number of bits can be placed from a given codeword position, c_(j), without completing any codeword of the OBC. Checks to ensure that Constraint 5 is satisfied must be made when placing the remaining n−μ coded bits of c_(j).

For the (8,4) OBC of Table 1, μ=4. Hence, a suitable ordering of t∈{0, . . . , 7} can be selected to allow a maximum of μ=4 coded bit positions to be placed into u freely and without restriction. The remaining n−μ=4 coded bit positions from each codeword position need to be checked to ensure Constraint 5 is satisfied. Note that the process of “placing” coded bits involves finding, one by one, a permuted ordering of the coded-bit locations in c to define the corresponding permuted ordering of coded bit positions in the vector u.
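The value μ=4 can be confirmed by exhaustive search over the codewords of Table 1. The sketch below is illustrative; the helper names are assumptions, and a codeword is treated as “completed” when all of its ones lie in the set of already placed coded bit positions.

```python
from itertools import combinations

# Non-zero codewords #2..#16 of Table 1, written as bit strings.
CODEWORDS = [
    "10001011", "01001101", "00100111", "00011110",
    "11000110", "10101100", "10010101", "01101010",
    "01010011", "00111001", "11100001", "11011000",
    "10110010", "01110100", "11111111",
]
SUPPORTS = [frozenset(i for i, b in enumerate(cw) if b == "1") for cw in CODEWORDS]

def completes_any(placed):
    """True if every one of some non-zero codeword lies in the placed positions."""
    placed = set(placed)
    return any(support <= placed for support in SUPPORTS)

def max_free_positions(n=8):
    """Largest number of bit positions that can be placed without completing a codeword."""
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if not completes_any(subset):
                return size, subset
    return 0, ()

mu, example = max_free_positions()
print(mu, example)                   # 4 (0, 1, 2, 3)
print(completes_any([0, 1, 3, 4]))   # True: these are the ones of codeword #13
```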

The CI-4 interleaver design process can be started off by pseudo-randomly selecting and placing μ·ρ coded bits that can be placed in u without any restriction. Next, the CI-4 design process can proceed by randomly selecting a remaining coded bit in c, one at a time, to be placed into the sequence u. At this point codewords will start to complete and care should be taken to ensure that Constraint 5 is satisfied. Note that in a brute force approach, there would be a large number of combinations of the OBC to consider to ensure that Constraint 5 is satisfied. This number of combinations

$\sum_{n_{i} = 1}^{N_{c}}\begin{pmatrix}\rho \\n_{i}\end{pmatrix}$

is very large, especially at higher values of ρ. The CI-4 interleaver design algorithm presented below reduces the complexity greatly by only evaluating the relatively few codeword combinations that can potentially give rise to low weight CTBC codewords, v. In the process of sequentially placing the coded bits of codeword positions c_(j)=(c_(j1), c_(j2), . . . , c_(jn)), for j=0, 1, 2, . . . , ρ−1, consider the placement of a coded bit of codeword position c_(j) on u that will end up completing one or more valid codewords in Table 1. At the point in time of placing each coded bit, it is assumed that any and all of the previously placed coded bits were placed into u in such a way as to meet Constraint 5. This condition is clearly met after placing the first μ·ρ coded bits as described above, but from then forward, additional care needs to be taken to avoid placing any bit in a “restricted zone,” i.e., into any range of one or more locations that would cause Constraint 5 to be violated. As u fills up beyond the first μ·ρ coded bits, a list L of already completed codewords is preferably maintained which includes the identification numbers in Table 1 of each completed codeword along with the positions of their respective coded bits in u. By the end of the CI-4 interleaver design process, the list L will have grown to include ρ*2^(k) entries, containing all codewords of the OBC mapped from all of the codeword positions, c_(j)=(c_(j1), c_(j2), . . . , c_(jn)), for j=0, 1, 2, . . . , ρ−1.
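To give a feel for how quickly the brute-force count grows, the sum of binomial coefficients given above can be evaluated directly. This is a sketch only; the function name and the sample values of ρ are illustrative assumptions.

```python
from math import comb

def brute_force_combinations(rho, n_c):
    """Sum over n_i = 1..N_c of C(rho, n_i), the brute-force count from the text."""
    return sum(comb(rho, n_i) for n_i in range(1, n_c + 1))

print(brute_force_combinations(10, 2))    # 10 + 45 = 55
# Growth with rho for N_c = 8 (d_t = 16, d_0 = 4); values are printed, not asserted.
for rho in (16, 64, 256):
    print(rho, brute_force_combinations(rho, 8))
```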

Once μ bits from codeword position c_(j) have already been placed, when placing a “current coded bit” position (nj+t) of the codeword position c_(j), t∈{0, . . . , 7}, care needs to be taken to assure that Constraint 5 is met for each codeword in Table 1 that completes upon this bit's placement. That is, for each of the codewords that complete, in accordance with Constraint 5, identify any restricted zones in u where coded bit (nj+t) cannot be placed due to the codewords that currently complete due to the placement of bit (nj+t). In addition, combinations of other already completed codewords from codeword positions other than c_(j) need to be considered to determine if additional restricted zones in u exist due to combinations of the currently completing codeword(s) with other already completed codewords from different codeword positions, e.g., c_(j2).

Consider the example of FIG. 4A where placing a new bit of a single non-zero codeword into u is considered. Define the set S₁ to be the set containing the already placed bits of each codeword on the list L that will complete when the currently being placed bit, (nj+t), is placed. As shown in FIG. 4A, for the (8,4) OBC as enumerated in Table 1, d₀−1=3 ones of any weight d₀ codeword in the set S₁ that will be completed are already placed. The indices of the already placed ones of the completing codeword can be reordered as a₁<a₂< . . . <a_(d0−1). In the d₀=4 example of FIG. 4A, this is a₁<a₂<a₃. Also shown in FIG. 4A are the “separations,” S₁ and S₂, that exist between the already placed ones of the weight d₀ codeword in Table 1 that will complete once bit (nj+t) is placed into u. A worst case condition occurs when the currently being placed bit is mapped to location “A” in FIG. 4A, forming a doublet with a₁, leading to W[v]=1+S₂. Similarly, if the currently being placed bit is mapped to location “B” in FIG. 4A, forming a doublet with a₃, then W[v]=1+S₁. If the currently being placed bit is mapped to form a doublet on either side of a₂, then W[v]=S₁+S₂−1. In general, the odd numbered indices give rise to the worst case conditions and the even numbered indices need not be evaluated separately to identify the restricted zones. Algorithm 1 as described below can be used to find any restricted zones for placing the bit (nj+t) at a location π(nj+t) in u. When it is desired to identify restricted zones based on a currently completing weight d₀ codeword by itself, Algorithm 1 below is called with a “positions vector,” given by p=[a₁, . . . , a_(d0−1)]. The positions vector, p, is an ordered set of indices where the streams of ones in v will potentially begin and terminate. In the example of FIG. 4A, p=[a₁, . . . , a₃]. Because the number of elements in p is odd, no adjustment (as discussed below) is needed, so set p_adj=p and call Algorithm 1 with p_adj and LengthP_adj=3.

Algorithm 1

Given that d_(t), p_adj and LengthP_adj have been specified:

1. Set local variables, p=p_adj and LengthP=LengthP_adj.
2. For i=1, . . . , LengthP, compute S_(i)=[p(i+1)−p(i)].
3. Initialize S₀=0, X(1)=0, and $Y(1) = \sum_{i = 2,\ (\text{even } i)}^{LengthP} S_{i}$.
4. For i=1, 3, . . . , LengthP (odd i):
   a. X(i+1)=X(i)+S_(i−1)
   b. Y(i+1)=Y(i)−S_(i+1).
5. Compute w(i)=d_(t)−X(i)−Y(i)−1 for i=1, 3, . . . , LengthP (odd i).
6. If w(i)≤0, there are no restrictions on either side of p(2i−1).
7. If w(i)>0, restrictions are needed around p(2i−1). If w(i) extends up to or beyond the next or previous element of p, then the restricted zone is the entire region between p(2i−1) and the near-by element. If w(i)>0 and w(i) does not extend up to a near-by element of p on one or both sides, a window of width w_(RZ)=w(i) of p(2i−1) identifies a restricted zone in the direction(s) where there is no near-by element.

To better understand Algorithm 1, consider FIG. 4B. Note that when the new bit is placed at C=π(nj+t), forming a doublet with p(3), W[v]=S₁+1+S₄. In Algorithm 1, the X(i) terms account for the preceding odd numbered distances and the Y(i) terms account for the succeeding even numbered distances when considering the placement of the new bit near an interior odd-numbered point like p(3). Only the odd-indexed values in p(i), e.g., p(2i−1), need to be checked for troublesome areas to place the new bit on u. The even numbered p-indices are less important because a stream of ones in v ends there as opposed to beginning there when π(nj+t) is selected to be near an even-numbered p-index.
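The window computation of Algorithm 1 can be sketched as follows. This is one reading of the algorithm and not a definitive implementation: it assumes that the anchors are the odd-indexed elements p(1), p(3), . . . , that X holds the sum of the odd-numbered separations before the anchor, and that Y holds the sum of the even-numbered separations after it, consistent with the FIG. 4B example W[v]=S₁+1+S₄. The function name and return format are illustrative assumptions, and the clipping of a window against neighboring elements of p (step 7) is not modeled.

```python
def restricted_windows(p, d_t):
    """Return {anchor position: w} for odd-indexed anchors whose window w is positive.

    p   : positions vector with an odd number of elements (already adjusted)
    d_t : target MHD
    """
    if len(p) % 2 == 0:
        raise ValueError("positions vector must have odd length; adjust it first")
    p = sorted(p)
    s = [p[i + 1] - p[i] for i in range(len(p) - 1)]  # s[0] is S_1, s[1] is S_2, ...
    x = 0                     # odd-numbered separations preceding the anchor
    y = sum(s[1::2])          # even-numbered separations following the anchor
    windows = {}
    for k in range(0, len(p), 2):       # 0-based k corresponds to p(k+1), an odd index
        w = d_t - x - y - 1             # a doublet at this anchor gives W[v] = x + 1 + y
        if w > 0:
            windows[p[k]] = w
        if k + 1 < len(s):
            x += s[k]                   # S_(k+1) now precedes the next anchor
            y -= s[k + 1]               # S_(k+2) no longer follows the next anchor
    return windows

# FIG. 4A style example with assumed positions: S_1 = 5, S_2 = 15, d_t = 10.
print(restricted_windows([10, 15, 30], 10))   # {30: 4}
```

In this example a doublet with a₁ gives W[v]=1+S₂=16≥d_(t), so no restriction is needed there, while a doublet with a₃ gives W[v]=S₁+1=6<d_(t), so locations within 4 positions of a₃ are restricted.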

An adjustment will be needed prior to calling Algorithm 1 for cases when the length of the positions vector, p, is even. For example, consider FIGS. 4A and 4B, where p=[a₁, . . . , a_(d0−1)], but when d₀ is odd. Specifically, suppose that d₀=3, in which case d₀−1=2, LengthP=d₀−1=2, and p=[a₁, a₂]. To make the adjustment, create an “imaginary partner,” which, in this example, is a_(d0)=a₃=ρn−1. This imaginary partner is used because a stream of ones in v will start at the location of the d₀^(th) one placed in u and this stream of ones will terminate at the end of vector v, i.e., at the end location, ρn−1. Append this additional element to p to form p_adj=[p, ρn−1], with LengthP_adj=d₀=3. This way, when Algorithm 1 is called, the adjusted positions vector, p_adj, will have an odd number of elements as required by Algorithm 1. This type of adjustment is also made for any combination of codewords that causes the positions vector, p, to have an even number of elements.
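The adjustment can be expressed in a few lines. The function name and the sample values (ρ=4, n=8) are illustrative assumptions.

```python
def adjust_positions(p, rho, n):
    """Append the imaginary partner rho*n - 1 when p has an even number of elements."""
    p_adj = sorted(p)
    if len(p_adj) % 2 == 0:
        p_adj.append(rho * n - 1)   # the stream of ones runs to the end of v
    return p_adj

print(adjust_positions([3, 9], rho=4, n=8))      # [3, 9, 31]
print(adjust_positions([1, 2, 3], rho=4, n=8))   # [1, 2, 3] (already odd length)
```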

To identify combinations of two codewords that potentially can give rise to low weight vectors, v, again consider the limiting worst case conditions. Assume that an already placed codeword exists whose ordered indices can be written as {b₁, b₂, b₃, b₄} as illustrated in FIG. 5A. The information about this already completed codeword will already have been added to the list L at the time this codeword had completed. Similar to FIG. 4A, assume that the currently being placed bit, (nj+t), is to be placed in such a way so as to complete a weight d₀=4 codeword of Table 1. Again, the three already placed ones of this codeword have ordered indices, {a₁, a₂, a₃}. In this example, worst case conditions occur when one or more bits of {a₁, . . . , a₃} form doublets with each other and/or with the already placed bits, {b₁, b₂, b₃, b₄}. As illustrated in FIG. 5A, in the worst case condition, the 2d₀−1 already placed bits from the two codewords form d₀−1 doublets and leave one odd bit out that is not paired up. In this case, a window can be formed around the remaining unpaired bit, e.g., a₃, and this window is w₂=[d_(t)−(d₀−1)]. In the example of FIG. 5A, this window separation between a₃ and a₄ will ensure that W[v]=(3+|a₄−a₃|)≥d_(t).

In the construction of sets as described in the various paragraphs below, the actions are to be carried out separately and as many times as is needed to account for each codeword that will be completed by placing the coded bit (nj+t). Also, the elements of each set, S_(i), will generally contain the identities from the list L of each codeword that has already completed and that will be used in a combination with the codeword currently under consideration that will complete due to the placement of coded bit (nj+t). Subsets of such elements are added separately for each codeword that completes due to the placement of coded bit (nj+t). The elements of each set, S_(i), can be considered to be vectors of (i−1)-tuples of previously completed codewords, where each element of the tuple corresponds to a corresponding already completed codeword on the list L. From each set, S_(i), can be constructed a corresponding set of position vectors, {p}_(i), which will be evaluated by Algorithm 1 to find any restricted zones due to combinations of each currently completing codeword with already completed codewords on the list L that can potentially form the low-weight combinations that need to be avoided to satisfy Constraint 5.

The above example regarding FIG. 5A illustrates a more general condition of when an already placed codeword from codeword position c_(j0) needs to be evaluated when placing a bit of a current codeword position c_(j). A set, S₂, is defined to contain all of the completed codewords from other codeword positions, c_(j0), that need to be evaluated when placing a coded bit from current codeword position c_(j) that causes one or more current codewords to complete. To identify codewords for inclusion in the set S₂, check to see if at least one of the already placed ones of an already placed codeword from the list, L, falls within a window of w₂=[d_(t)−(d₀−1)] bit locations on either side of any of the already placed ones from each codeword to be completed due to the placement of bit (nj+t). Once the set of codewords in the set S₂ has been identified, Algorithm 1 can be called to identify all of the restricted zones, if any, for all the identified combinations of codewords in the set S₂ with the current codeword being completed by the placement of bit (nj+t) into u. If no already completed codewords from the list L were found within the specified windows, w₂, around the already placed bits of a selected currently completing codeword, then the set S₂ will be empty.
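The window test used to populate S₂ can be sketched as below. The data layout is an assumption made for illustration: the list L is modeled as a dictionary keyed by a (codeword position, Table 1 number) pair and holding the positions in u of that codeword's ones. The function name and the sample positions are likewise hypothetical.

```python
def build_s2(current_j, current_ones, completed, d_t, d_0):
    """Return keys of completed codewords with a one within w_2 of a current one.

    current_j    : index j of the codeword position being placed
    current_ones : positions in u of the already placed ones of the completing codeword
    completed    : {(j0, table_id): [positions of ones in u]}  (hypothetical model of L)
    """
    w_2 = d_t - (d_0 - 1)
    s_2 = []
    for (j0, table_id), positions in completed.items():
        if j0 == current_j:
            continue                       # only other codeword positions are considered
        if any(abs(b - a) <= w_2 for a in current_ones for b in positions):
            s_2.append((j0, table_id))
    return s_2

completed = {
    (0, 6): [10, 20, 30, 44],         # nearest one is 6 locations from 50
    (1, 13): [100, 120, 130, 140],    # nearest one is 25 locations from 75
}
print(build_s2(2, [50, 60, 75], completed, d_t=10, d_0=4))   # [(0, 6)] since w_2 = 7
```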

In the context of FIG. 5A, a positions vector, p, for the case of two codeword combinations can be formed. Recall that the positions vector is an ordered set of indices where the already placed ones occur. In the example of FIG. 5A, reading the indices in order from left to right, p=[b₁,b₂,b₃,a₁,a₂,b₄,a₃]=[p₁, . . . , p_(2d0−1)]. Typically, the worst case condition will not occur, and many of the doublets shown in FIG. 5A will not be doublets but will have separations such as s₁, s₂, . . . , s_(m) between them, where in the example of FIG. 5A, m=2d₀−2 (the length of the positions vector minus 1, see FIG. 4B). To find the set of restricted locations for the example of FIG. 5A, call Algorithm 1 using the above indicated positions vector, p.

Similarly, FIG. 5B illustrates how windows can be used to capture the potentially problematic combinations of three codewords and identify the restricted zones on u that cannot be used to place the current bit, (nj+t). In the example of FIG. 5B, assume that the bits of two already placed codewords are given by b₁< . . . <b₄ and c₁< . . . <c₄, and note that none of the already placed bits a₁<a₂<a₃ of the current codeword paired up with any of the ones of the second already placed codeword, i.e., the already placed bits c₁< . . . <c₄. By counting the number of doublets that form in the worst case situation for the three codeword case as illustrated in FIG. 5B, the maximum window size when checking for combinations of three codewords can be lowered to w₃=[d_(t)−((3d₀/2)−1)]=[d_(t)−(6−1)]=[d_(t)−5] on either side of a₃.
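The window widths for the two- and three-codeword cases follow the same doublet count. The sketch below assumes that a combination of i codewords can form (i·d₀/2)−1 doublets, which matches the d₀−1 doublets of the two-codeword case and the five doublets of the three-codeword case; rounding i·d₀/2 down when it is not an integer is an additional assumption. The function name is illustrative.

```python
def window_width(i, d_t, d_0):
    """w_i = d_t - ((i * d_0 / 2) - 1) for combinations of i codewords."""
    doublets = (i * d_0) // 2 - 1    # floor used when i*d_0 is odd (assumption)
    return d_t - doublets

d_t, d_0 = 16, 4
print(window_width(2, d_t, d_0))   # 13 = d_t - (d_0 - 1)
print(window_width(3, d_t, d_0))   # 11 = d_t - 5
```

Larger combinations therefore use smaller windows, which keeps the number of candidate codewords that must be examined small.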

In the context of placing a t^(th) bit of a current codeword to satisfy Constraint 5 for combinations of three codewords, define the set, S₃, as containing the indices of all the already placed bits of each 2-tuple of two codewords that can potentially form low weight vectors, v, when combined with each codeword that completes when the current bit, (nj+t), is placed. The definition of the set S₃ can be considered a next action of an iterative set construction where each element of each set contains the combinations of other already placed codewords that need to be checked in combination with the currently being placed bit of a current codeword. That is, the elements of the set S₃ contain the indices of the 2d₀ already placed ones of the two codewords that need to be checked along with the d₀−1 already placed bits of the current codeword. The d₀−1 already placed bits of the current codeword can then be appended to each codeword 2-tuple contained in the set S₃, and these indices can be sorted to form a set of positions vectors, {p}₃. The position vectors in {p}₃ can each be sent to Algorithm 1 with ni=3 to find any restricted zones that come up due to all possible combinations of three codewords. To identify each 2-tuple of codewords to be included in S₃, first look for codewords on the list L of completed codewords that fall within the window w₃ around already placed bits of each codeword of Table 1 that completes due to the placement of bit (nj+t). Call this set S₂′ because it looks like the set S₂ but uses the smaller window, w₃. For example, in FIG. 5B, bits b₃ and b₄ are in S₂′ because they fall within the window w₃ around a₁ and a₂. Now identify all possible distinct combinations of two codewords in S₂′ and include them as 2-tuple elements in the set S₃. If there is only one codeword in S₂′, then do not add any 2-tuples to S₃ at this time. If there are no codewords in S₂′, stop, because the set S₃ (and all higher order sets S_(i), for i=3, . . . , Nc) will be empty. Because the 2-tuples will later be sorted along with the already placed bits from the currently being completed codeword, the ordering of the codewords in the 2-tuples does not add new combinations to the set S₃. That is, two tuples must differ by at least one codeword on the list L to be considered distinct. Also, a distinct tuple will never include any given completed codeword from the list L more than once. While the above process identifies a subset of the set S₃, there are still more elements that need to be considered for possible inclusion in the set S₃.

Next consider an additional type of element for possible inclusion in the set S₃, called a “chained tuple” of codewords. To understand the concept of chained tuples of codewords, consider the low weight example of FIG. 5B and note that none of c₁, . . . , c₄ fall into the window w₃ around any of the already placed bits a₁, a₂, a₃. However, b₃ and b₄ do, and c₃ and c₄ fall within w₃ of b₁ and b₂. Hence the chained tuple [(b₁, . . . , b₄), (c₁, . . . , c₄)] needs to be considered when placing the final bit of the currently being completed codeword whose already placed bits are at a₁, a₂, a₃.

To identify the chained tuples for possible inclusion into the set S₃, perform the following steps for each codeword in the set S₂′: 1) Select a current codeword under consideration from the set S₂′. 2) Create a new set of windows around each of the coded bit positions in u of the current codeword in S₂′ under consideration. 3) For this current codeword under consideration, identify all of the other already completed codewords from the list L that have at least one coded bit positioned within the window, w₃. 4) For each such identified codeword from the list L, identify a 2-tuple of codewords consisting of the current codeword under consideration and the newly identified codeword from the list L with at least one bit within the window, w₃. For each chained tuple found that is not already in the set S₃, add it to the set S₃. At the end of this process the set S₃ will be complete. If no chained tuples were found, no new elements will be added to the set S₃. If S₂′ only had one element and no chained tuples were found, then the set S₃ is empty.
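The construction of S₃ from direct pairs in S₂′ plus chained tuples can be sketched as follows. This is a simplified illustration under stated assumptions: the list L is modeled as a dictionary from a hypothetical codeword label to the positions of its ones in u, windows are taken around the ones only, and 2-tuples are stored as frozensets so that ordering does not create duplicates. All names and positions are illustrative.

```python
from itertools import combinations

def within_window(ones_a, ones_b, w):
    """True if some one of ones_b lies within w locations of some one of ones_a."""
    return any(abs(a - b) <= w for a in ones_a for b in ones_b)

def build_s3(current_ones, completed, w_3):
    """Return the set of distinct 2-tuples (as frozensets of labels) forming S_3."""
    # S_2': completed codewords with a one inside the window around the current ones.
    s2_prime = [cid for cid, pos in completed.items()
                if within_window(current_ones, pos, w_3)]
    # Direct 2-tuples: all distinct pairs of codewords in S_2'.
    s_3 = {frozenset(pair) for pair in combinations(s2_prime, 2)}
    # Chained 2-tuples: a codeword in S_2' paired with any other completed
    # codeword that falls inside the window around that codeword's own ones.
    for cid in s2_prime:
        for other, pos in completed.items():
            if other != cid and within_window(completed[cid], pos, w_3):
                s_3.add(frozenset((cid, other)))
    return s_3

# FIG. 5B style example with assumed positions and w_3 = 5.
completed = {
    "B": [30, 32, 48, 51],       # two ones near the current codeword
    "C": [10, 12, 28, 31],       # near B only, so reachable only by chaining
    "D": [200, 210, 220, 230],   # far from everything
}
print(build_s3([50, 52, 70], completed, w_3=5))   # {frozenset({'B', 'C'})}
```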

In the context of placing a t^(th) bit of a current codeword to satisfy Constraint 5 for combinations of i>3 codewords, define the set, S_(i), as containing the indices of all the already placed bits of the (i−1)-tuples of codewords to be considered in combination with a current codeword when placing bit (nj+t) of the current codeword. If the set S_(i−1) is empty, then the set S_(i) and all subsequent sets will also be empty. In combinations of i codewords, (i*d₀/2)−1 doublets can be formed, so the worst case window is w_(i)=[d_(i)−((i*d₀/2)−1)]. To identify each (i−1)-tuple of codewords to be included in S_(i), an iterative process is used that begins by first looking for codewords on the list L of completed codewords that fall within the window w_(ni) around coded bits of the current codeword. Also call this set S₂′ because it looks like the set S₂ but uses the smaller window, w_(ni). Start by including any and all distinct (i−1)-tuples of codewords in S₂′ into S_(i). If the set S₂′ is empty, stop, because the set S_(i) will also be empty. If the set S₂′ contains fewer than i−1 elements, no (i−1)-tuples of codewords will be formed or added to the set S_(i) at this time.

Next construct a set S₃′. Start by including all of the distinct 2-tuples, if any, of codewords in the set S₂′ into the set S₃′. As in the construction of the set S₃ as previously discussed, next identify chained 2-tuples of codewords using the codeword(s) in S₂′ to identify the other potentially existing codewords from the list L within the newly introduced windows of size w_(i). Add any distinct chained 2-tuples found to the set S₃′. If the set S₃′ is empty, stop, because S_(i) will also be empty. Next form the set S₄′ by including any and all distinct 3-tuples formed as combinations of a codeword in S₂′ with a 2-tuple in S₃′. Next the chaining is applied as discussed in the context of S₃ to form chained 3-tuples for further inclusion into S₄′. This process of forming tuples and chaining is iteratively continued until the set S_(i) is reached or until the set S_(i) is found to be empty. The elements of the set S_(i) will be (i−1)-tuples of the indices of the locations of the ones of i−1 other completed codewords to be considered in combination along with the indices of the d₀−1 already placed ones of the codeword currently being placed.

That is, the elements of the set S_(i) are used to construct a set of vectors, {p}_(i), where each vector p in this set contains an ordered list of the indices of all of the already placed ones from the current codeword along with the indices of all of the already placed ones from i−1 additional completed codewords that need to be considered in combinations of i codewords to ensure Constraint 5 is satisfied.

As illustrated in FIG. 5C, in the case where d₀ is odd, there is another type of condition that can give rise to low weight combinations of codewords. Note that while a₁ and b₁ form pairs, there are no elements of {c₁,c₂,c₃} in the vicinity of the codeword currently being placed, whose already placed ones are located at {a₁,a₂}. Hence it is possible to randomly place a bit at π(nj+t) to avoid the window shown around a₂, which was placed in light of b₁ and b₂, but to place π(nj+t) near c₃ without restriction. However, a low weight pair {a₃, c₃} would cause a problem due to the existence of the doublet {c₁,c₂}. More generally, even when d₀ is even, there is a possibility that new low weight combinations can be formed due to chaining from as yet non-analyzed already placed bits located near the chosen placement point, π(nj+t).

In order to ensure that a situation like that shown in FIG. 5C or the more general case does not happen, the concept of a tentative placement is introduced. That is, in the case where d₀ is odd, when a bit is finally randomly placed in accordance with all of the above mentioned checks in order to meet Constraint 5, an additional set of windows, w₂, w₃, . . . w_(ni), whose widths are described above, needs to be additionally checked on either side of where π(nj+t) is tentatively placed. Then during these additional checks, any newly identified completed codewords from L should be added to sets S₂, S₃, etc. Note that if no such codewords are added to S₂, then no codewords will be added to any other set as the window widths gradually decrease from w₂. If no such codewords are added, accept the tentative position, π(nj+t). If one or more such codewords are found within the newly introduced windows, then, without placing the coded bit at π(nj+t), repeat the above described process with the newly found combinations that include one or more newly added codewords along with the already placed coded bits of the codeword currently being placed. This process can find additional positions to be restricted due to the codewords newly added to the sets S₂, . . . S_(ni). If π(nj+t) is not one of the newly restricted positions, accept π(nj+t) and place that coded bit of the codeword currently being placed at π(nj+t). However, if π(nj+t) happens to be one of the newly restricted positions, then π(nj+t) should be discarded and a new tentative position needs to be randomly identified among the non-restricted and available remaining locations on u. In order to find a new tentative position, add these newly found positions to be restricted to the set of restricted positions and randomly select a new position among the available remaining positions. Once such a new tentative position is identified, repeat the process until an acceptable position for the coded bit of the current codeword has been found.

It follows from the above discussion that in order to find a position in u, π(nj+t), where a coded bit currently being placed, (nj+t), in c can be placed in accordance with Constraint 5, perform the following actions:

1. For i=1, . . . , Nc,

-   a. Identify set S_(i), and stop early if any S_(i) is empty.
-   b. Construct the corresponding set of position vectors, {p}_(i).
-   c. For each p∈{p}_(i) determine LengthP, and:
    -   i. IF LengthP is even, set p_adj=[p, ρn−1], and LengthP_adj=LengthP+1; ELSE set p_adj=p, and LengthP_adj=LengthP.
    -   ii. Call Algorithm 1 using p_adj and LengthP_adj.

2. Take the union of all restricted zones found by Algorithm 1 in action 1 above.

3. Randomly select π(nj+t) among the remaining available positions.

4. Open a sequence of windows of widths w₂, w₃, . . . w_(Nc) around π(nj+t). Identify any completed codewords on L within this newly introduced set of windows. If no such codewords are found, place bit (nj+t) at π(nj+t). If completed codewords are found within this newly introduced set of windows, do action 5.

5. Without placing the coded bit in question at π(nj+t), do action 1 by augmenting each set S_(i) for i=2, . . . , Nc, with any newly identified elements due to the newly opened windows of action 4. Use Algorithm 1 to identify the newly selected positions to be restricted from this run. If π(nj+t) is not in the newly found positions to be restricted, stop. If π(nj+t) is in the set of newly found positions to be restricted, add the newly found positions to be restricted to the set of positions to be restricted and repeat actions 4-5 until a permutation position that can be confirmed is found. Once a position is confirmed, place the coded bit at the final π(nj+t).

6. In the event that the algorithm is having difficulty, perform a roll-back and continue. A “roll-back” is defined as undoing (and eventually re-placing) any number of already placed positions. A record is preferably kept as to the most recently placed positions so that the roll-back can remove any desired number of most recently placed positions. When these positions are removed, the restricted zones due to these already placed positions that are being undone will have been removed when placing subsequent positions, and the randomization in the placement process will provide opportunity to bypass the current problem that caused the roll-back to occur. Alternatively, positions that have caused a proportionally large number of restricted zones to appear while placing subsequent positions can be removed. A respective list is preferably kept for each respective placed position. Each respective list identifies all of the restricted zones that had to be removed for subsequent positions being placed due to the respective already-placed position. This allows the most troublesome positions to be intelligently selected for removal/unplacement in the roll-back. Further aspects of intelligent roll-backs are discussed below. Also discussed below is RCID (reverse constrained interleaver design), which is a general method of performing intelligent roll-backs, i.e., ensuring a set of interleaver constraints are met for an interleaver whose positions are already placed, but may not already meet the interleaver constraints.
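Parts of actions 1-3 above can be sketched as follows. This is a minimal sketch, not the full procedure: Algorithm 1 is not reproduced in this section, so its output is represented here by a hypothetical list of sets, `restricted_zones`, and the helper names `adjust_position_vector` and `select_position` are illustrative only.

```python
import random

def adjust_position_vector(p, rho, n):
    """Action 1.c.i: pad an even-length positions vector with index rho*n - 1."""
    p = list(p)
    p_adj = p + [rho * n - 1] if len(p) % 2 == 0 else p
    return p_adj, len(p_adj)

def select_position(frame_size, occupied, restricted_zones, rng):
    """Actions 2-3: union the restricted zones, then pick a free position at random.

    Returns None when no position is available, which a caller could treat
    as the cue for the roll-back of action 6.
    """
    restricted = set().union(*restricted_zones)
    candidates = [q for q in range(frame_size)
                  if q not in occupied and q not in restricted]
    return rng.choice(candidates) if candidates else None

# Example with rho = 4 codewords of length n = 8 (frame size 32).
p_adj, length_adj = adjust_position_vector([3, 7], rho=4, n=8)
pos = select_position(8, {0, 1}, [{2, 3}, {3, 4, 5}], random.Random(0))
```

The tentative-placement checks of actions 4-5 would wrap `select_position` in a loop that adds newly found restricted positions and selects again.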

The above approach selectively considers only the necessary combinations of codewords to place coded bits on u. It is seen that the complexity of this method increases with increasing d_(t) values. Even though the above approach still considers different combinations of codewords, the complexity of the above outlined algorithm is much lower than searching over all possible codeword combinations of all of the ρ codewords.

Referring now to FIG. 6, a design method 600 is provided to design various types of L=1 constrained interleavers, π_(CI−L=1):c→u, such as are used in block 311 of FIG. 3. The design method 600 can be used to design CI-3, CI-4 or mixed CI-3/CI-4 interleavers that implement all of Constraints 1-5. In the context of deterministic interleavers versus random interleavers, the method 600 is primarily used to design random constrained interleaver (“RCI”) implementations of constrained interleavers that typically rely upon lookup tables of frame-size-specific state machines and state-transition logic. The method 600 is first described for use in designing a CI-4 interleaver.

To understand the action 602, recall that the number μ represents the maximum number of coded bit positions that can be placed from each codeword position, c_(j), without completing a codeword. That is, for any fixed j, indicative of the set of coded bit positions (nj+t), the variable t may take on μ different selected values, {t}_(μ)⊂{0, . . . , n−1}, (called a “μ-subset”), to identify a corresponding subset of μ different coded bit positions, {(nj+t)}_(μ), that can be placed into any j^(th) codeword position without completing any codeword of the OBC. For example, when the (8,4) Hamming code of Table 1 is used, μ=4, and {t}_(μ)={0, 1, 2, 3} or {t}_(μ)={4, 5, 6, 7} represent valid μ-subsets because if four ones were placed into the coded bit positions {(nj+t)} using the four t-values from either of these two μ-subsets, no codeword in the (8,4) Hamming code of Table 1 would complete. Depending on the code, there will be a fixed number of valid μ-subsets that applies to all codeword positions, c_(j). The action 602 generates a μ-set by selecting a respective μ-subset to be used in each codeword position. For example, if there are a total of 8 valid μ-subsets in a given code, then the action 602 could use a randomly selected one of these μ-subsets for use in each of the codeword positions, c_(j). With L=1 constrained interleavers, a CTBC code is typically constructed using ρ different (n,k) codewords of the OBC, and the frame size is K=ρn coded bits. Therefore, any complete μ-set of coded bit positions will contain up to a total of μ*ρ coded bit positions.
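The notion of a valid μ-subset can be checked by brute force for a small code. The sketch below is illustrative only. Table 1 is not reproduced here, so the generator matrix `G` is an assumed systematic form of an (8,4) extended Hamming code and may differ from the code of Table 1. A subset of t-values is treated as valid when no nonzero codeword has all of its ones inside the subset, which is one reading of "without completing a codeword."

```python
from itertools import combinations, product

# Assumed systematic generator G = [I | P] of an (8,4) extended Hamming code.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]
N = 8

def nonzero_supports(gen):
    """Supports (sets of one-positions) of every nonzero codeword."""
    supports = []
    for msg in product([0, 1], repeat=len(gen)):
        if any(msg):
            word = [sum(m * row[i] for m, row in zip(msg, gen)) % 2
                    for i in range(N)]
            supports.append(frozenset(i for i, bit in enumerate(word) if bit))
    return supports

SUPPORTS = nonzero_supports(G)

def is_valid_subset(t_values):
    """True if placing ones at these t-values completes no nonzero codeword."""
    chosen = set(t_values)
    return not any(s <= chosen for s in SUPPORTS)

def valid_subsets(size):
    """All subsets of {0, ..., N-1} of the given size that complete no codeword."""
    return [c for c in combinations(range(N), size) if is_valid_subset(c)]

# mu is the largest subset size for which at least one valid subset exists.
mu = max(size for size in range(1, N + 1) if valid_subsets(size))
```

Under this assumed generator, `mu` evaluates to 4 and both {0, 1, 2, 3} and {4, 5, 6, 7} pass `is_valid_subset`, in line with the example in the text. Action 602 could then draw one entry of `valid_subsets(mu)` at random for each codeword position.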

When the CI-4 design approach is in use, the action 602 preferably generates a complete μ-set, {(nj+t)_(μ): j=0, . . . , ρ−1}, with t taking on the μ different values in each j^(th) selected μ-subset. Once the action 602 identifies a complete μ-set, a variable “PLACED” is set to the number of elements in the μ-set, e.g., PLACED=μ*ρ. In some embodiments, in order to provide additional flexibility in mapping, the action 602 creates a partial μ-set in order to loosen the requirements later while placing coded bits subject to the constraints. Whether a complete or incomplete μ-set is selected, the action 605 identifies a corresponding set of permutation locations, {π(nj+t)_(μ): j=0, . . . , ρ−1}, for the μ-set selected in the action 602. The “non-constrained” portion of action 605 refers to Constraint 5. Constraint 5 need not be evaluated while placing the bits in the action 605 because it is guaranteed that no codewords of the OBC will complete during the placement of these bits.

Control next passes to action 610 where one or more (Δ) remaining coded bit positions (i.e., Δ bit positions that have not yet been placed), {(nj+t)}, are selected. Often, Δ=1. In preferred embodiments, the selection 610 is generated using a pseudo random number generator that generates an index into a vector that includes all the indices of the bits that have not yet been placed, although other selection criteria can be used. When Δ>1, if a particular selected coded bit position, (nj+t), cannot be placed at a particular location, π_(candidate), then a different selected bit position being analyzed at the same time can be checked to see if it can be permuted to the location π_(candidate). This way, the action 610 can work to avoid or resolve potential conflicts and provide further flexibility in finding a valid permutation function, π_(CI−L=1):c→u.

Control next passes to an action 615 which performs an analysis function. Action 615 identifies any and all restricted zones associated with placing the selected bit, (nj+t). If a plurality of coded bit positions, {(nj+t)} (a “mapping group”), have been selected in the action 610, then all of the restricted zones associated with placing each of the plurality of coded bit positions are preferably identified in the action 615. In a preferred embodiment, the action 615 generates a set of positions vectors for each of the one or more coded bit positions in the mapping group and passes these positions vectors to Algorithm 1 in order to identify their respective restricted zones. As discussed below, different target MHDs may optionally be used in Algorithm 1 depending on LengthP of each positions vector.

Next control passes to an action 620 which starts by identifying one or more candidate permutation locations. In embodiments where the action 615 only selects one coded bit to be placed at a time, i.e., where each mapping group has only one coded bit, the action 620 identifies a candidate permutation position, π(nj+t), in which to place the coded bit position, (nj+t), from c to u. The permutation location, π(nj+t), is selected to be outside of any restricted zones identified for the selected coded bit position, (nj+t). As described above in connection with the CI-4 design algorithm, one or more verification actions are next taken to ensure that, once placed, no new constraint violations occur. This verification is preferably performed by opening a set of windows around the candidate bit placement location, π(nj+t), and determining whether any already completed codewords on the list L have any bits within these new windows. If not, the bit (nj+t) can be verified to be placeable at the candidate bit placement location, π(nj+t). If one or more coded bit positions from already placed codewords are found to be in the new windows, Algorithm 1 is preferably used to identify any new restricted zones. Next it is determined whether the candidate bit placement location, π(nj+t), is located within any new restricted zones. If not, the action 620 performs the placement (nj+t)→π(nj+t) and declares this placement to be verified. If the placement cannot be verified, then a new candidate placement is selected outside all identified restricted zones and the process is continued until a verified location can be found. If, for example, toward the very end of the method 600, no location π(nj+t) can be found to be verifiable, then a roll-back procedure as described above is invoked and the method 600 is reentered at action 610 using the rolled back state of the method 600. In embodiments where the mapping group consists of a single coded bit position, then Δ=1, and control passes to action 625 where the variable PLACED is incremented by one.

In embodiments where the mapping group has more than one element, the actions 610-625 perform additional functions. For example, suppose that the action 610 selects a mapping group, {(nj+t)}, that contains ten different candidate coded bit positions from different codeword positions, c_(j), to be mapped together as a group. Then the action 610 would identify the ten different coded bit positions in the mapping group. For each coded bit position in the mapping group, the action 615 would identify a respective set of positions vectors and would use Algorithm 1 to identify ten respective sets of restricted zones. Next the action 620 would observe the ten sets of restricted zones and analyze additional information, such as overlapping restricted zones and zones where none of the bits in the mapping group had any restrictions. Such additional information could be used in a mapping group placement strategy to more intelligently place one or more of the coded bits in the mapping group. For example, permutation positions located outside of the union of these restricted zones would likely lead to verifiable placements. Also, for example, if the restricted zones of nine of the coded bit positions overlapped in a certain “crowded area(s),” but the tenth coded bit to be placed did not, it may be desirable to place the tenth coded bit position into an identified crowded area in order to fill a difficult position. The mapping group placement strategy is preferably organized to increase a measure of performance such as the probability of finding valid CI-4 interleaver solutions by eventually being able to place all of the K bits in the frame.

In the example where the mapping group has ten coded bit positions, suppose that a candidate target location in u, π_(candidate-1), is selected by a random number generator during the action 620. In this example, assume that π_(candidate-1) is not in any restricted zone of six of the ten coded bit positions in the mapping group, {(nj+t)}. Then the verification portion of the action 620 could be carried out for these six coded bit positions. Suppose that three of these six coded bit positions were verified to be placeable at π_(candidate-1). This information can be recorded, and another candidate permutation location, for example π_(candidate-2), could be similarly analyzed. This analysis can be continued up to π_(candidate-10). Now, with the knowledge of the verifiable placements and the interactions between the different coded bit positions in the mapping group, the action 620 can determine an ordering in which to make the placements and final verifications of the coded bits in the mapping group. For example, the action 620 can recursively perform tentative placements and verifications among the remaining bit positions to be placed based on the first pass analysis above. The action 620 continues analyzing the effects of different placement strategies until all of the bits of the mapping group have been placed, in which case the parameter Δ is set as Δ=10. Also, the action 620 preferably maintains data records indicating placements that could have been made but were not selected. This information can later be used if and when a roll-back is needed. When the method 600 is near completion, it is possible that only Δ<10 placements can be made for a given mapping group. In such cases the roll-back process discussed above can be invoked and the method 600 reentered at action 610, or Δ<10 placements can be made and the parameter Δ can be set to the number of coded bit positions that have actually been placed. Control then passes to action 625 where the variable PLACED is incremented by Δ.

In one type of embodiment, the mapping group can be selected to be all of the remaining bits to be placed outside of the originally selected μ-set. In such an embodiment, the action 615 analyzes all possible valid placement positions for each remaining coded bit outside the μ-set. Next, computer-chess forward looking trellis logic is used whereby each placement is considered to be a “move.” Using the same type of game theory forward looking analysis as is used in computer chess games, the action 620 could analyze all sequences of “moves” and identify a sequence of “moves” that caused the method 600 to “win” the game, i.e., to place all of the coded bits into a proper CI-4 interleaver design. While such an approach requires more computing time, such logic is well known, and the method 600 will be carried out off line with the final result potentially used millions and millions of times in the future or published in a standards document. Also, the computer-chess forward looking trellis logic can be applied during roll-backs to a smaller portion of the placement problem within which the trouble spots have been identified.

Control next passes from action 625 to action 630 which then passes control back to the action 610 until an error condition arises where certain placements cannot be verified, or until the condition PLACED=K is met, in which case the CI-4 permutation vector, π, is supplied as output. If control passed to the action 630 because of an error condition (e.g., Δ=0), then a roll-back as discussed above is performed and the process 610-630 is continued until the condition PLACED=K is met. Once this condition is met, the entire permutation vector, π, is output from the design algorithm 600.

The method 600 can also be configured to perform an “analysis run” that does not place coded bit positions from c to u in actions 605 and 620. In analysis runs, the coded bit positions are assumed to already have been placed by a previous run of the method 600, so the method 600 is configured only to identify and analyze restricted zones for a specified MHD>d_(t). Analysis runs are used to identify a set of positions vectors (and their respective lengths) that correspond to low weight CTBC coded sequences whose weights are d_(t)<d≦d_(f), for some specified weight, d_(f). It is assumed that all CTBC coded sequences d≦d_(t) will already have been eliminated in the previous run of the method 600 that performed the placements subject to d_(t). If, as discussed below, multiple d_(t)'s are used in the method 600, then the set of weights identified in the analysis run, d_(t)<d≦d_(f), holds for the lowest value of d_(t) used in the run of the method 600 that performed the placements. Analysis runs can be configured to provide additional information such as the restricted zones associated with each of the identified positions vectors in the higher weight regions, d_(t)<d≦d_(f). While the previous run of the method will have avoided all restricted zones for all weights d<d_(t), there will be new restricted zones that can be identified for remaining low weight CTBC codes whose weights are in the range d_(t)<d≦d_(f).

The method 600 can also be configured to perform a CI-3 interleaver design, or a mixed CI-3/CI-4 interleaver design. First consider configuring the method 600 to perform a CI-3 interleaver design. To start, the action 602 is configured by setting the parameter μ to μ=1. This causes action 605 to place one bit from each codeword without constraints. For example, action 605 can use a random number generator to place a total of ρ coded bit positions, one from each codeword position, from c onto u. Control next passes to action 610 which is configured to use a mapping group containing one bit, i.e., Δ=1.

In a CI-3 embodiment of the method 600, action 615 is configured to identify the restricted zones due to Constraints 1-4. For Constraint 1, assuming a value has been specified for s₁, the restricted zones for a coded bit (nj+t) consist of all positions within s₁ locations in u away from any already placed coded bits, π(nj+t) for t∈{0, . . . , n−1}. For Constraint 2, a list of pairs of codeword positions (c_(j), c_(j1)) that have a coded bit from c_(j) and a coded bit from c_(j1) separated by exactly (l_(max)+1) positions is maintained. When finding a position for a coded bit of c_(j), each list element containing c_(j) as an element of a pair of codeword positions (c_(j), c_(j1)) is identified, and all positions within (l_(max)+1) from the remaining coded bit positions of each c_(j1) on the list are added to the Constraint-2 restricted zone for placing the current bit of c_(j). For Constraint 3, checks are performed to see that if a coded bit of codeword position c_(j) and a coded bit of codeword position c_(j1) have a separation of l, then when finding a position for a coded bit of c_(j), the Constraint 3 restricted zones include all positions within (s₁−l) away from already placed other coded bits of c_(j1). For Constraint 4, neighboring codewords of every coded bit on u are monitored. For each codeword for j=0, 1, . . . , ρ−1, and for each s=1, 2, . . . , (l_(max)+1), a respective list of neighboring codewords, Ln_(j)(s), is maintained whose list entries identify all of the neighboring codewords of c_(j) in u that have a coded bit at a minimum separation of s relative to any of the n coded bits of the codeword c_(j). When selecting a position for a coded bit position (nj+t) of codeword position, c_(j), the lists Ln_(j)(s) are consulted. Suppose that c_(jx) is an entry of Ln_(j)(1), and c_(jy) is an entry of Ln_(jx)(1). Then when placing coded bit position (nj+t), the Constraint 4 restricted zone includes one position around each coded bit of codeword c_(jy).

In the action 620, the coded bit position (nj+t) is placed at a selected permutation position, π(nj+t), that is outside of all restricted zones identified in the action 615. For example, a random number generator can select π(nj+t) from among the remaining non-restricted positions. In the action 625 the variable PLACED is incremented by Δ=1. In the action 630 the end conditions are checked and control is passed back to action 610 until a CI-3 interleaver is available. If needed, a roll-back can be performed prior to returning control to the action 610. Upon completion the method 600 will provide as output a full CI-3 interleaver, π.
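The control flow of the CI-3 configuration can be illustrated with a heavily simplified sketch. It enforces only Constraint 1 (Constraints 2-4 are omitted), treats "within s₁" as a distance of at most s₁ (an assumption), and uses a simple roll-back that undoes a growing number of the most recent placements. The function names and the roll-back schedule are illustrative choices, not the method 600 itself.

```python
import random

def constraint1_zone(bit, pi, n, s1):
    """Positions within s1 of any already placed bit of the same codeword."""
    j = bit // n
    zone = set()
    for other in range(n * j, n * (j + 1)):
        if other in pi:
            zone.update(range(pi[other] - s1, pi[other] + s1 + 1))
    return zone

def design_ci3_sketch(rho, n, s1, seed=0, max_rollbacks=10_000):
    rng = random.Random(seed)
    K = rho * n
    pi = {}        # coded bit index (n*j + t) -> position in u
    history = []   # placement order, most recent last, for roll-backs

    def place(bit, pos):
        pi[bit] = pos
        history.append(bit)

    # Actions 602/605 with mu = 1: one bit per codeword, no constraints.
    free = list(range(K))
    rng.shuffle(free)
    for j in range(rho):
        place(n * j + rng.randrange(n), free[j])

    rollbacks = 0
    while len(pi) < K:                                  # action 630: stop at PLACED == K
        remaining = [b for b in range(K) if b not in pi]
        bit = rng.choice(remaining)                     # action 610 with delta = 1
        zone = constraint1_zone(bit, pi, n, s1)         # action 615 (Constraint 1 only)
        occupied = set(pi.values())
        candidates = [q for q in range(K)
                      if q not in occupied and q not in zone]
        if candidates:
            place(bit, rng.choice(candidates))          # actions 620 and 625
            continue
        # Roll-back: undo a growing number of the most recent placements.
        rollbacks += 1
        if rollbacks > max_rollbacks:
            raise RuntimeError("no interleaver found; relax s1 or enlarge rho")
        for _ in range(min(len(history), 2 * rollbacks)):
            del pi[history.pop()]
    return [pi[b] for b in range(K)]

perm = design_ci3_sketch(rho=8, n=4, s1=2)
```

The returned list gives π(nj+t) for each coded bit index nj+t. A fuller version would add the zones for Constraints 2-4 to `zone` before the candidate list is formed.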

To configure the method 600 to design a mixed CI-3/CI-4 interleaver, the method 600 is instantiated twice, with one instantiation configured to design a CI-3 interleaver as described above (“the CI-3 instantiation”) and the other instantiation configured to design a CI-4 interleaver as also described above (“the CI-4 instantiation”). The two instantiations will work on the same problem together and communicate and synchronize with each other as described in the example embodiment provided immediately below.

To start, the CI-3 instantiation is allowed to execute actions 602-630 as described above until PLACED=μ*ρ, where the value of μ refers to the value of μ used in the CI-4 instantiation, e.g., μ=4 when the (8,4) Hamming code is used in the OBC. Now that the entire μ-set has been placed in accordance with Constraints 1-4, action 610 of the CI-3 instantiation is allowed to select the next coded bit position, (nj+t), to be placed. Action 615 of the CI-3 instantiation next identifies all restricted positions for all of Constraints 1-4 as described above. At this point, the coded bit position to be placed, (nj+t), and all the restricted zones for Constraints 1-4 are passed to the CI-4 instantiation. The CI-4 instantiation then executes action 615 using this selected coded bit position (nj+t) and identifies all of its CI-4 restricted zones using Constraint 5. The CI-4 instantiation then takes the union of all CI-3 and CI-4 restricted zones to form the final restricted zone for coded bit position (nj+t). Actions 620-630 are then performed just as in the CI-4 approach, except once control passes back to action 610, the CI-3 instantiation is allowed to take over and the cycle repeats this way until the mixed CI-3/CI-4 interleaver, π, is available at the output.

To understand the concept of running the method 600 using multiple target MHDs, consider an example where the method 600 has already been run to determine a CI-4 interleaver with a target MHD of d_(t). Next, an analysis run of the method 600 is performed using a higher distance value, namely d_(f)+1 where d_(t)≦d_(f), so that the analysis run identifies all low weight CTBC codewords with weights d_(t)<d≦d_(f). In the analysis run, some of the additional information collected can include statistics that tabulate the low weight CTBC coded sequences and monitor how many weight d CTBC codewords there are in each of the above described categories of codewords. Specifically, the specific categories of the CTBC codewords whose weights are in the range d_(t)<d≦d_(f) are evaluated and their positions vectors, p(d), are tabulated in respective tables, P(d). Note that the principal probability of error contributions are thus given by

$\begin{matrix}{P_{e,d_f} \approx {\sum\limits_{d = d_t}^{d_f}{A_{d} \times {P\left( {d,\gamma_{b}} \right)}}}} & (8)\end{matrix}$

where P_(e,df) denotes the error probability due to low weight CTBC codewords whose weights are in the range d_(t)<d≦d_(f) and P(d, γ_(b)) is the probability of decoding in favor of a CTBC codeword with a weight d error sequence at a bit signal to noise ratio of γ_(b)=E_(b)/N₀. A further granularity in the error coefficients, A_(d), can be discerned using the statistics provided by the analysis run of the method 600. The analysis run preferably tabulates how many weight d CTBC sequences come from each of the categories of low weight error sequences as defined herein above, i.e., Φ₁, . . . , Φ₄ and Φ_(m) ^((CI-4)), for m=1, . . . , N_(c), at each weight, d_(t)<d≦d_(f). Define the category-error-coefficient expansion, for each d_(t)<d≦d_(f), as follows:

$\begin{matrix}{A_{d} = {\sum\limits_{{cat} = 1}^{N_c + 4}{A_{d}({cat})}}} & (9)\end{matrix}$

where the A_(d)(cat) values (“category-error-coefficients”) equal A_(d) times the percentage of low weight CTBC codewords at weight d that come from, respectively, categories Φ₁, . . . , Φ₄ and Φ_(m) ^((CI-4)), for m=1, . . . , N_(c), and divided by 100. In a CI-3 design only the first four A_(d)(cat) values can be non-zero, and in a CI-4 design only the A_(d)(cat) values for cat=5, . . . , N_(c)+4 can be non-zero. In a mixed CI-3/CI-4 design, all the A_(d)(cat) values in equation (9) can be nonzero. As discussed earlier, in mixed CI-3/CI-4 interleaver designs, if a given positions vector determined in a CI-4 instantiation identifies a low weight error vector that has already been identified as a member of any of categories 1-4, that positions vector would be grouped into its respective category 1-4 and not counted a second time in a category cat≧5. It can be noted that combinatorics could alternatively be used to determine closed form expressions for each of the category-error-coefficients. With these definitions, equation (8) can be modified to take advantage of this additional information as

$\begin{matrix}{P_{e,d_f} \approx {\sum\limits_{d = d_t}^{d_f}{\sum\limits_{{cat} = 1}^{N_c + 4}{{A_{d}({cat})}\,{P\left( {d({cat}),\gamma_{b}} \right)}}}}} & (10)\end{matrix}$

where d(cat) is a separate target minimum distance, d_(t)<d(cat)≦d_(f), defined for each category of codewords. The values of d(cat) are used as the multiple target MHDs in the method 600. The values of d(cat) are selected to lower the error probability of equation (10) below that of equation (8) or preferably to minimize the error probability of equation (10).
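The bookkeeping behind equations (9) and (10) can be sketched as follows. This is an illustrative sketch only: the per-category counts are made-up numbers, and the pairwise error probability P(d, γ_(b)) is passed in as a hypothetical callable `pairwise`, because its closed form depends on the modulation and channel and is not given in this section.

```python
def category_error_coefficients(a_d, counts):
    """Split A_d across categories: A_d times the percentage, divided by 100."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [a_d * (100.0 * c / total) / 100.0 for c in counts]

def error_estimate(coeffs_by_d, d_cat, pairwise, gamma_b):
    """Evaluate the double sum of equation (10) literally.

    coeffs_by_d maps each weight d to its list of A_d(cat) values and
    d_cat[cat] is the distance assigned to that category.
    """
    return sum(a * pairwise(d_cat[cat], gamma_b)
               for coeffs in coeffs_by_d.values()
               for cat, a in enumerate(coeffs))

# Hypothetical CI-4 design with N_c = 3, hence N_c + 4 = 7 categories;
# only categories 5..7 (list indices 4..6) can be non-zero.
coeffs = category_error_coefficients(10, [0, 0, 0, 0, 6, 3, 1])
estimate = error_estimate({17: coeffs}, [1, 1, 1, 1, 1, 2, 3],
                          lambda d, g: 2.0 ** (-d), gamma_b=1.0)
```

By construction the entries of `coeffs` sum back to A_(d), which is the statement of equation (9).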

Note that the LengthP parameter sent to Algorithm 1 identifies each positions vector as belonging to a certain category, Φ_(m) ^((CI-4)), for m=1, . . . , N_(c). Hence the d_(t) value used in Algorithm 1 can be changed to d_(t)(LengthP), which corresponds to the d(cat) values for cat≧5 in equation (10). Different d(cat) values can be used in each of Constraints 1-5 so that each category of low distance error sequences is made to give rise to a lower overall contribution in (10) from the identified values of A_(d)(cat) and d_(t)(cat).

Reverse Constrained Interleaver Design (RCID):

In the CI-3 and CI-4 constrained interleaver design examples presented so far, the constrained interleaver, π:c→u, i.e., u=π[c], was created by sequentially placing coded bit positions, one at a time, from the coded sequence c into an initially empty interleaver vector u. The placement of the coded bits into the vector u was performed subject to a set of interleaver constraints that, when satisfied, ensure that a target minimum Hamming distance, d_(t), will be maintained at the output of the IRCC. This process of sequential placement was continued until the interleaver vector u was completely filled with K=nρ coded bits. RCID (reverse constrained interleaver design) takes the view that the coded bits from the vector c have initially been placed into the vector u. However, this initial placement may very well violate the interleaver constraints. RCID then applies a systematic approach to rearrange certain selected bits in u in order to arrive at a new vector u that gives rise to a vector v that does satisfy the constraints and thus achieves the target minimum Hamming distance, d_(t). In the CI-3 and CI-4 constrained interleaver design methods, as the coded bits from the vector c were sequentially placed into the vector u, in some cases it was indicated that roll-backs may have been needed. As explained below, RCID can also be used at the time it is determined that a roll-back will be required. RCID is then used to convert a current interleaver u that requires a roll-back into an interleaver u that meets the interleaver constraints.

The RCID method starts by assuming all of the positions have already been placed into the interleaver u and thus the u vector is initially full. For example, the vector u may be initialized to the natural ordering of coded bits in the vector c by setting u=c. At this time the interleaver, π, amounts to an identity transformation and the coded bits of the concatenation, v=G[u]=G[π[c]], most likely violate the target MHD requirement d_(t). The RCID approach remedies this situation by removing a predetermined set of coded bits from the initial vector u until all sub-distance error sequences, denoted i_(P<), with weights d<d_(t) are eliminated. This is preferably performed by removing the minimum possible number of bits from u so as to prevent the sub-distance error sequences, i_(P<), from completing. The act of removing already placed positions from the vector u creates a set of “holes” in the vector u that correspond to the now-vacated positions in u.

RCID next seeks to place the removed positions back in the interleaver in such a way as to achieve the target MHD, d_(t). An advantage of this approach is that there will be a smaller number of bits that need to be placed, thus making the interleaver design simpler. Also, using this approach, shorter interleavers (i.e., with a lower value of ρ) can be constructed that meet the MHD requirement, d_(t). However, when RCID is used to construct minimally short constrained interleavers that meet the MHD requirement, d_(t), the interleaver π will only permute the bits that have been removed and thus will not be nearly as random as the previously discussed CI-3 and CI-4 designs. Such RCID interleavers will thus sacrifice much of the interleaver gain. However, when RCID is used to resolve roll-back conditions, this disadvantage does not arise. Nor does it arise if the initial u vector is set as u=π_(rand)[c], where π_(rand) is a random interleaver.

To better understand RCID, consider a simple example. In this example, let the OBC be a (4,1) repetition code that is formed by repeating each message bit four times, and whose MHD is d₀=4. That is, the c vector is equal to a vector of message bits with each message bit repeated four times. Therefore, each codeword is a four-fold repetition of one message bit, and each codeword requires four coded bits in order to complete. In this example, a concatenation v=G[u]=G[π[c]] will be formed using the accumulator IRCC for the inner code, and the RCID interleaver will be selected to achieve a target MHD of d_(t)=16. Note that this target MHD corresponds to the maximum that can be achieved using a CI-2 since CI-2 can achieve d_(t)=d₀²d_(i). Using RCID, set the initial condition u=c so that the interleaver π is initialized to the identity transformation. In this example, the goal is to modify π (i.e., the ordering in u) in a minimalistic way so that the concatenation, v=G[u]=G[π[c]], achieves d_(t)=16. To achieve this MHD, the following steps may be used:

1. Start by placing all 4ρ coded bits into u by setting u=c. At this time the resulting MHD (from each single OBC codeword) is only 2 because the accumulator IRCC converts each sequence like “1111” into “1010”. Hence, changes are needed.

2. In order to ensure that none of the ρ codewords can complete, remove one coded bit from each codeword position. For example, remove the last coded bit position from every codeword position. This will result in removing a total of ρ coded bit positions at locations congruent to 3 modulo 4 (u(i), i MOD 4=3). Once these coded bit positions are removed, no codeword of the OBC will be completed in u. Hence, with these removals, there will be no sub-distance error sequences, i_(P<), and thus there will be no violations to the MHD objective, d_(t)=16.

3. Next the RCID approach seeks to place only the ρ removed coded bit positions back into the ρ holes created in u, but in a different order. The removed positions need to be placed back into the holes in such a way as to achieve the target MHD, d_(t). In this example, this involves placing only ρ removed coded bit positions as opposed to placing the entire set of 4ρ as is needed in the above described CI-3 and CI-4 design methods. When placing this smaller set of ρ removed coded bit positions, the CI-3 or CI-4 constraints as described above can be used. For example, if Constraint 5 is used, and considering a single codeword, with the fourth bit position (i MOD 4=3) taken out of codeword position zero, if the coded bit position removed from this first codeword is placed at position u(i) where i≧19, then the concatenation, v=G[u]=G[π[c]], will achieve d_(t)≧16 for this single codeword error event. Similarly, any one or more of Constraints 1-5 can be checked/applied when placing the removed coded bit positions back into u to account for low distance error events involving multiple codewords as well.
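By way of illustration only, the single-codeword arithmetic of the steps above can be checked with a short sketch. The sketch assumes that the accumulator IRCC computes v(k)=v(k−1) XOR u(k) from a zero initial state, which is the usual accumulator model but is an assumption here, and the function names are hypothetical.

```python
def accumulate(bits):
    """Assumed accumulator model: v[k] = v[k-1] XOR u[k], initial state 0."""
    out, state = [], 0
    for b in bits:
        state ^= b
        out.append(state)
    return out


def single_codeword_weight(positions, length):
    """Output weight when only the given positions of u carry a 1."""
    u = [0] * length
    for p in positions:
        u[p] = 1
    return sum(accumulate(u))


# Step 1: identity interleaver, codeword zero occupies positions 0..3.
natural = single_codeword_weight([0, 1, 2, 3], 32)

# Step 3: the removed fourth bit of codeword zero is re-placed at i = 19.
moved = single_codeword_weight([0, 1, 2, 19], 32)

print(natural, moved)
```

Under this assumed model the natural ordering yields an output weight of 2, in agreement with step 1, and re-placing the removed bit at i=19 yields an output weight that meets the target of 16 for this single-codeword event.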

In the above example, the parameter ρ need not be known ahead of time. Instead, the method can be carried out and the lowest value of ρ for which the MHD requirement can be met can be determined to be ρ_(min). This allows a minimum frame size, K_(min)=nρ_(min), to be determined that is needed to meet the target MHD. Also, it should be noted that this simple example was provided to illustrate the main RCID concepts. RCID could be applied to the (8,4) Hamming code as well. In the case of the (8,4) Hamming code, for example, RCID can be applied by starting with u=c and then removing the last four bit positions from each codeword position in u. In this case, the vector u will hold the first four coded bit positions of each codeword position in natural order. The second four bit positions of each codeword position will include bit positions from other codeword positions, and in a more randomized order. However, this approach will lead to a lower interleaver gain due to less randomization as compared to the CI-3 and CI-4 design approaches. The RCID approach may be desirable if it is desired to use as small a frame size as possible to meet a given MHD requirement.

The RCID technique can more generally be described as follows:

1. Starting with a given permutation, u=π[c], (where π may initially be the identity transformation or an interleaver being designed using one or both of the above-described CI-3 or CI-4 design methods, but at a point in need of a roll-back, for example), determine a (preferably minimal) set of positions that can be removed from u that will prevent any sub-distance error sequences, i_(P<), from occurring, so that no violations to the target MHD, d_(t), occur in the concatenation v=G[u]=G[π[c]]. This can be done by removing the minimum number of coded bit positions from u, or by removing bits in steps of a fixed number of bits at a time, or in any other sequential or grouped manner. Also, particular coded bits that have already been placed but are causing too many restricted zones to be present in the remaining bits to be placed can also be removed as a part of this first step.

2. Place the removed positions back into the holes created in u, but in a different order, such that the required MHD condition of the concatenation is achieved. The re-ordering can be done by placing the removed positions subject to a selected set of constraints. Alternatively, each removed coded bit can be placed back into a randomly selected hole, followed by checking to see if any constraint violations have been made by that placement and only allowing valid placements. Similarly, any combination of the above mentioned placing or swapping methods can be used that checks for and prevents any sub-distance error sequences, i_(P<), from occurring.
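The two steps above can be sketched for the (4,1) repetition example as follows. This is a simplified sketch under stated assumptions and not the full design method: only single-codeword error events are checked (multi-codeword events covered by Constraints 1-5 are ignored), the accumulator is assumed to compute v(k)=v(k−1) XOR u(k), and the names `rcid_repetition` and `codeword_weight`, the restart limit, and the seed are all hypothetical.

```python
import random


def codeword_weight(positions):
    """Accumulator output weight for a weight-4 input with 1s at `positions`
    (assumed model: output is 1 between the 1st/2nd and 3rd/4th input 1s)."""
    p = sorted(positions)
    return (p[1] - p[0]) + (p[3] - p[2])


def rcid_repetition(rho, d_t, seed=0, max_restarts=1000):
    """Return u, where u[i] is the index of the coded bit of c placed at
    position i, or None if no valid refill is found within the restart limit."""
    rng = random.Random(seed)
    n = 4
    holes = [n * j + 3 for j in range(rho)]      # step 1: vacate i MOD 4 = 3
    for _ in range(max_restarts):
        u = list(range(n * rho))                 # start from u = c
        removed = list(holes)                    # coded bits awaiting re-placement
        ok = True
        for hole in holes:                       # step 2: refill each hole
            # A removed bit b belongs to the codeword whose other three bits
            # still sit at positions b-3, b-2, b-1.
            valid = [b for b in removed
                     if codeword_weight([b - 3, b - 2, b - 1, hole]) >= d_t]
            if not valid:                        # dead end: restart the refill
                ok = False
                break
            bit = rng.choice(valid)              # random choice among valid bits
            removed.remove(bit)
            u[hole] = bit
        if ok:
            return u
    return None


u = rcid_repetition(rho=32, d_t=16)
print(u is not None)
```

Only the ρ vacated positions are re-ordered while the remaining 3ρ positions stay in natural order, which reflects the reduced randomization noted above for minimally short RCID interleavers.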

For example, consider how to apply RCID to perform a roll-back when the CI-3 and/or CI-4 design algorithm reaches a point where a roll-back is required. The RCID approach is preferably applied by considering all of the remaining positions to be placed into the holes created in step 1 of the RCID approach as outlined above. Additionally, already placed locations that are identified to be giving rise to excessive restricted zones for the remaining positions can also optionally be removed prior to the reordering and replacement step 2. Then the CI-3 and/or CI-4 design approaches are continued as per step 2 to fill the holes and complete the CI-3 and/or CI-4 interleaver design.

If necessary, the interleaver gain in RCID can be improved by removing more bits than are needed from u and then randomly selecting their reordering to place them back subject to the selected interleaver constraints. In this sense, the CI-3 and CI-4 design methods can be viewed as special cases of RCID where all of the coded bit positions are removed from u and are then placed back to maximize the interleaver gain.

Parallel Architectures with Deterministic Constrained Interleaver (DCI):

Interleavers are often required to exhibit the “contention free” property, also known as being “vectorizable.” Such interleavers have the advantage that they can be efficiently implemented in decoder chips that employ a set of M parallel processing engines that are able to make repeated parallel accesses to a bank of M parallel memories without any memory address conflicts or memory contentions. For example, the LTE standard uses a QPP interleaver which is 8-way vectorizable, and LTE decoder chips are often organized as 8-way parallel processing systems. OTN also uses M-way vectorizable interleavers and parallel processing chips, but most commonly with M>8 due to the very high data rates used in OTN applications. In the context of constrained interleavers, the “contention free”/“vectorizable” property can be formulated as an additional interleaver constraint. Herein, an “M-way vectorized deterministic constrained interleaver” corresponds to a DCI (deterministic constrained interleaver) that typically implements an SRCI such as CI-3 and/or CI-4, and also implements the vectorization constraint below. An M-way vectorized deterministic constrained interleaver also uses a deterministic pseudo-randomization function (such as the QPP or other deterministic interleaver). M-way vectorized deterministic constrained interleavers are preferably used in transmitters to generate CTBC codes that have vectorizable constrained interleavers. Also, a certain class of M-way vectorized deterministic constrained interleavers is used in high speed real time parallel access/parallel processing implementations of SISO decoders as described in more detail below. Also, a permutation is said to be deterministic and vectorizable if it meets the vectorization constraint below and can be generated by a pre-determined DCI using one or more predetermined mathematical formulas as discussed in further detail below.

Constraint 6 (“Vectorization Constraint”):

Given that the c and u vectors are of length K, in order for an interleaver to be M-way vectorizable, Constraint 6 requires that the permutation u=π[c] is selected to ensure that subsequences in c whose elements are spaced by multiples of K/M positions apart are permuted into re-ordered subsequences in u whose corresponding elements are also spaced by multiples of K/M positions apart.
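As a non-limiting sketch, Constraint 6 can be expressed as a simple test on a candidate permutation. The sketch assumes that `perm[i]` gives the position in u that receives coded bit c(i), that the QPP takes its standard form π(i)=(f₁i+f₂i²) mod K, and that K=40 uses the LTE parameter pair f₁=3, f₂=10. The function names are hypothetical.

```python
def satisfies_constraint_6(perm, M):
    """True if elements of c spaced by multiples of K/M land on positions of u
    that are also spaced by multiples of K/M (assumes perm[i] = position in u)."""
    K = len(perm)
    if K % M != 0 or sorted(perm) != list(range(K)):
        return False
    W = K // M                                   # the spacing K/M of Constraint 6
    for i_row in range(W):
        # Subsequence of c: i_row, i_row + W, i_row + 2W, ...
        residues = {perm[i_row + j * W] % W for j in range(M)}
        if len(residues) != 1:                   # images must share one residue mod W
            return False
    return True


def qpp(K, f1, f2):
    """Standard QPP form pi(i) = (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


lte_40 = qpp(40, 3, 10)
swapped = list(range(40))
swapped[0], swapped[1] = swapped[1], swapped[0]  # breaks the K/M spacing

print(satisfies_constraint_6(lte_40, 8), satisfies_constraint_6(swapped, 8))
```

In this sketch the QPP passes the test for M=8 while the identity permutation with two adjacent positions swapped does not, because positions 0 and 5 of c no longer map to positions of u that differ by a multiple of K/M.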

Constraint 6 can be better understood by considering an example memory system 710 arranged as a K/M×M matrix as shown in FIG. 7. Although in practice the frame size K is usually much larger, FIG. 7 illustrates a small example where K=40, M=8, and K/M=5. Note that the elements of the matrix 710 can be viewed as the indices of the elements of the vector c loaded into the memory 710 in column-major order. The quantity “K/M” of Constraint 6 corresponds to the number of elements in each column of the matrix 710. The indices of the elements of each row, i_(row), can be written in terms of the individual coded bit positions of the vector c, written as c(i), where i=K/M*j_(col)+i_(row), j_(col)=0, . . . , M−1, (i.e., “a subsequence in c whose elements are separated by multiples of K/M” as per Constraint 6).

Let [C]_(K/M×M) denote a K/M×M “vectorization matrix” into which the vector c is loaded in column-major order. Such a matrix C is shown as matrix 710 in FIG. 7 with K=40 and M=8. The coded-bit-position index, i, into vector c can be written, i=K/M*j_(col)+i_(row). Therefore, given the index i, j_(col)=i DIV K/M and i_(row)=i MOD K/M, where DIV and MOD are integer division operators for quotient and remainder respectively. When M is a power of 2, for example when M=2³=8, as shown in block 705 of FIG. 7, any such address can be viewed in binary form as having “MSB” and “LSB” portions that respectively refer to the “most significant bits” and the “least significant bits.” For example, when K=2^(x) and M=8=2³, the index, i, into the vector c can be written in terms of the bit positions of their binary addresses as [MSBs|LSBs]=[x−1, . . . , 3 | 2 1 0], where the MSBs identify i_(row), and the LSBs identify j_(col).
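The column-major index relations above, i=K/M*j_(col)+i_(row), j_(col)=i DIV K/M and i_(row)=i MOD K/M, can be illustrated with the small K=40, M=8 example of FIG. 7. The following is a minimal sketch, and the helper names are hypothetical.

```python
def to_row_col(i, K, M):
    """Split index i into (i_row, j_col) using i_row = i MOD K/M, j_col = i DIV K/M."""
    rows = K // M
    return i % rows, i // rows


def to_index(i_row, j_col, K, M):
    """Recombine: i = (K/M) * j_col + i_row."""
    return (K // M) * j_col + i_row


K, M = 40, 8
# Vectorization matrix C: entry (i_row, j_col) holds the index i of c(i).
C = [[to_index(r, c, K, M) for c in range(M)] for r in range(K // M)]
for row in C:
    print(row)
```

The first row printed holds the indices 0, 5, 10, . . . , 35 and the last row holds 4, 9, . . . , 39, so that each row is a subsequence of c whose elements are spaced K/M=5 positions apart.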

Any permutation that satisfies Constraint 6, π:c→u, can be factored as follows: u=π[c]=π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[π_(MSB)[C]], where π_(MSB)[] represents a single permutation over the integer ring {0, . . . , K/M−1} which is applied down each column of C, and π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[] represents a set of K/M different permutations, each defined over the integer ring {0, . . . , M−1}, and each respectively applied across row π_(MSB)[i_(row)] of C. Let [U]_(K/M×M) denote a K/M×M “implicit permutation matrix” that is implicitly loaded with the vector u in column-major order. In matrix notation, U is mathematically related to C according to U=π[C]=π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[π_(MSB)[C]], i.e., by applying π_(MSB)[] down each column of C and then separately applying π_(LSB)^(πi_(row))[] to the π_(MSB)[i_(row)]^(th) row of C. Using these notations, any given pair of row and column indices, (i_(row), j_(col)), of the matrix C are permuted to the row and column indices, (π_(MSB)[i_(row)], π_(LSB)^(πi_(row))[j_(col)]), of the matrix U. Constraint 6 ensures that any given row of elements in the matrix C maps to a row of the same elements, but in an intra-row-permuted ordering, in the matrix U. For example, Constraint 6 will require that the entire last row of the matrix 710 whose row index is i_(row)=4 will permute to row π_(MSB)[i_(row)] in U in such a way that [π(4), π(9), . . . , π(39)] will all be on the same row, π_(MSB)[i_(row)], but with a scrambled ordering in accordance with an intra-row permutation, π_(LSB)^(πi_(row))[]. The M×M interconnection network 730 is provided to perform each of the needed intra-row permutations, π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[].
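By way of example only, the factorization into a single row permutation and a set of K/M intra-row permutations can be sketched as below. The sketch assumes `perm[i]` is the position in u that receives c(i), uses the DIV/MOD row and column convention given for the vectorization matrix, and uses the standard QPP form with the K=40 LTE parameters f₁=3, f₂=10 as test input. All function names are hypothetical.

```python
def factor_vectorizable(perm, M):
    """Split a Constraint-6 permutation into (row_perm, intra):
    row_perm[i_row]     plays the role of pi_MSB[i_row],
    intra[new_row][j]   plays the role of pi_LSB for the row that lands on new_row."""
    K = len(perm)
    W = K // M
    row_perm = [perm[i_row] % W for i_row in range(W)]
    intra = {}
    for i_row in range(W):
        intra[row_perm[i_row]] = [perm[j * W + i_row] // W for j in range(M)]
    return row_perm, intra


def rebuild(row_perm, intra, M):
    """Recombine the two factors into the full permutation."""
    W = len(row_perm)
    perm = [0] * (W * M)
    for i_row in range(W):
        new_row = row_perm[i_row]
        for j in range(M):
            perm[j * W + i_row] = intra[new_row][j] * W + new_row
    return perm


K, M = 40, 8
perm = [(3 * i + 10 * i * i) % K for i in range(K)]   # assumed QPP, f1=3, f2=10
row_perm, intra = factor_vectorizable(perm, M)
print(row_perm)
print(intra[0])
```

For this input the row permutation is a permutation of {0, . . . , 4}, each intra-row permutation is a permutation of {0, . . . , 7}, and recombining the two factors reproduces the original permutation.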

The memory system 700 will be connected via the M×M interconnection network to a set of M processing engines, labeled as Proc(j_(col)), j_(col)=0, . . . , M−1 (not shown). The address generator 705 is configured to be able to count in natural order (to access any row of the C matrix in parallel), and also in permuted order (to access any row of the U matrix in parallel). For example, as discussed in further detail below in connection with FIGS. 11-12, during SISO (soft input soft output) decoder operations, the actual data elements stored in the memory 710 are LLR (log likelihood ratio) values. In the first half of the SISO iteration, each of the processors Proc(j_(col)), j_(col)=0, . . . , M−1, will need to access a respective column of the matrix U. To allow this to happen without contention, the subsets of LLRs stored on each of a sequence of selected rows, {π_(MSB)[i_(row)], i_(row)=0, . . . , K/M−1}, of the U matrix need to be accessed in parallel. In the second half of the SISO iteration, the subsets of LLRs stored on each of a sequence of selected rows of the C matrix will need to be accessed in parallel. By using the hardware arrangement 700, the data of the C matrix is stored in the memory 710 and there is no need to physically move the data from the matrix C to the matrix U. Instead, the data stays in place in the ordering as shown in the memory 710, and thus the matrix U is called the “implicit permutation matrix.” Address generator(s) 705 and optional address generators 715 and 720, working with the M×M interconnection network 730, are used to allow the memory 710 to be accessed as both the C matrix and the U matrix.

To understand how the memory 710 can be accessed in accordance with both the C and U matrices, consider an example involving a QPP interleaver as used in the current 4G LTE turbo code. During the second half of each SISO iteration, the block 705 acts as a sequential up/down counter that increments/decrements the row index, i_(row). During the first half of each SISO iteration, the block 705 performs QPP addressing as described above in connection with the Sun reference. A high speed M-way parallel hardware embodiment of the address counter 705 can be implemented to generate M consecutive QPP addresses in parallel. Inside the block 705 are M parallel QPP address generators that are configured to sequence through all of the addresses of all elements stored on each column of U. This way, all of the elements, (π_(MSB)[i_(row)], π_(LSB)^(πi_(row))[j_(col)]), for j_(col)=0, . . . , M−1, are generated in each parallel cycle by the address generator 705. Each of these M parallel QPP address generators is respectively initialized, similar to the discussion of recursion equations (2) and (3), but instead of each of them being initialized to zero, each recursive QPP address generator sub circuit is initialized with a respective index that appears in the first row of the matrix C. In the small example shown in block 710, these M=8 QPP recursive parallel address generators/counters inside of the block 705 would be respectively initialized with the indices 0, 5, 10, . . . , 35. During forward recursion operations, each such parallel QPP address generator would increment using its respective QPP recursion counter with a stride of d=Δi=M using the known techniques as described in the Sun reference. During backward recursion operations, as also explained in the Sun reference, these QPP recursion counters would increment backwards using a stride of d=Δi=−M. Due to the contention free property of QPP interleavers, once the M parallel address generators are initialized in this way, all of the parallel QPP address counters will be guaranteed to generate the same row address, which can be extracted from the MSBs of any or all of the M parallel QPP address generators. The set, {LSBs}_(M), made up of the M=8 LSBs from these respective M=8 QPP parallel address recursion generators/counters is shown as one of the outputs of block 705 in FIG. 7. The set, {LSBs}_(M), defines the intra-row permutation π_(LSB)^(πi_(row))[] to be used on the π_(MSB)[i_(row)]^(th) row during the same cycle where the MSB output of the address generator 705 generates the row address, π_(MSB)[i_(row)].
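The parallel recursive address generation described above can be sketched in software as follows. Recursion equations (2) and (3) are not reproduced in this passage, so the sketch uses the standard second-difference recursion for π(i)=(f₁i+f₂i²) mod K, which may differ in detail from those equations. The stride d is left as a parameter, the row and column split follows the DIV/MOD convention given for the vectorization matrix, and the function names and the K=40 parameters f₁=3, f₂=10 are assumptions.

```python
def qpp_direct(i, K, f1, f2):
    """Standard QPP form pi(i) = (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def qpp_recursive(K, f1, f2, start, d, count):
    """Return pi(start), pi(start+d), ... using only additions modulo K."""
    addr = qpp_direct(start, K, f1, f2)                  # initial address
    g = (f1 * d + f2 * d * d + 2 * f2 * d * start) % K   # first difference
    step = (2 * f2 * d * d) % K                          # constant second difference
    out = []
    for _ in range(count):
        out.append(addr)
        addr = (addr + g) % K
        g = (g + step) % K
    return out


def parallel_addresses(K, M, f1, f2, d, count):
    """M generators seeded with the first-row indices 0, K/M, 2K/M, ...;
    returns one list of M addresses per parallel cycle."""
    W = K // M
    lanes = [qpp_recursive(K, f1, f2, j * W, d, count) for j in range(M)]
    return [[lane[t] for lane in lanes] for t in range(count)]


K, M = 40, 8
for cycle in parallel_addresses(K, M, 3, 10, d=1, count=K // M):
    rows = {a % (K // M) for a in cycle}      # common row address
    cols = [a // (K // M) for a in cycle]     # intra-row permutation
    print(rows, cols)
```

In each cycle of this sketch the M generated addresses share a single row address while their column parts form a permutation of {0, . . . , M−1}, which is the contention free behavior relied upon above.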

In the operation of the system described above, while addressing the matrix C, all the up/down row counter in the block 705 needs to do is to provide a single row address, i_(row), because once this row is accessed in parallel, data elements C[i_(row), j_(col)], j_(col)=0, . . . , M−1, can then be passed directly to the set of processors Proc(j_(col)), j_(col)=0, . . . , M−1 (not shown in FIG. 7) via the M×M interconnection network 730. However, when the MSBs of each of the set of M parallel QPP address generators are used to generate the same sequence of row addresses, π_(MSB)[i_(row)], the LSBs of each respective one of the M parallel-generated QPP addresses will be equal to π_(LSB)^(πi_(row))[j_(col)], for j_(col)=0, . . . , M−1. Therefore, when the memory array 710 is being accessed in permuted order, elements C[π_(MSB)[i_(row)], π_(LSB)^(πi_(row))[j_(col)]], for j_(col)=0, . . . , M−1, need to be passed to processors Proc(j_(col)), j_(col)=0, . . . , M−1. To pass each row's elements to the correct respective processor, the LSBs of each of the M=8 parallel-generated QPP addresses are decoded and used to control the M×M interconnection network 730. A detailed description of the low level circuits that can be used to implement such decoding is known to those skilled in the art and is described in the Studer reference. Based on the decoded {LSBs}_(M) information, the M×M interconnection network 730 will permute the elements of row π_(MSB)[i_(row)] of C so that each respective element, U(i_(row), j_(col))=C[π_(MSB)[i_(row)], π_(LSB)^(πi_(row))[j_(col)]], is sent to its respective target processor, Proc(j_(col)), for j_(col)=0, . . . , M−1.

Next consider an example where the memory system 700 is specifically used while decoding a CTBC code. To look at a larger example than that shown in the block 710, let the frame size be K=4096 (2¹²) and let there be M=2³=8-way vectorization, so that each column of the interleaver matrix 710 has 2¹²⁻³=2⁹=512 bits per column. Assuming that the (8,4) Hamming code of Table 1 is being used as the OBC, there will be 2³=8 bits per OBC codeword, and thus each column of C will contain 2⁹⁻³=2⁶=64 codewords of the OBC. In the first half of a SISO iteration, each of the M=8 processors will need to access a respective column of the implicit permutation matrix U. Since the U matrix is never explicitly formed, the address generator 705 is preferably configured to generate a set of QPP permuted row and column addresses using the parallel configuration based on the Sun reference as described in detail above. During the first half of the CTBC code's SISO decoding operation, IRCC decoding is performed, so that each of the M=8 processors performs parallel decoding on a separate column of the U matrix in order to decode a respective length-K/M=512 subsequence of the CTBC codeword, v. During the second half of the SISO decoder cycle, the address counter 705 counts in natural order, 0, . . . , K/M−1, and the M×M interconnection network 730 performs a direct pass through, so that each of the M=8 processors can perform, in parallel, an OBC SISO decoding cycle on a subsequence of 64 codewords stored in each column of the matrix C.

Certain permutations like the QPP are already factorizable, in which case a set of MSBs extracted from the address generator 705 can be used to select a row, and the LSBs of each of M parallel-generated QPP addresses can be decoded and used to control the interconnection network 730 to apply the intra-row permutation to the elements of the selected row. However, an aspect of the present invention 700 contemplates that any valid permutation over the integer ring {0, . . . , K/M−1}, π_(MSB)[], can be used to select rows in the memory 710, whether π_(MSB)[] is vectorizable or not. Then any independent set of intra-row permutations, π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[], can be applied across the selected rows of C, and this combination will give rise to a valid vectorizable permutation, u=π[c]=π_(LSB)^(πi_(row)=π{0, . . . , K/M−1})[π_(MSB)[C]]. As can be seen in FIG. 7, one or more permutation address generators, 705, 715, 720, can be used to generate alternative sets of {LSBs}_(M). While three sources of intra-row permutation addresses are shown in FIG. 7, in general any number of intra-row address generators can be used as needed to form DCIs as discussed below.

As previously discussed, a “random interleaver” can be defined in opposition to a “deterministic interleaver” that uses a mathematical formula to generate the deterministic interleaver permutation. A random interleaver is thus often implemented as a table look up or with a state-machine logic circuit whose sequencing logic does not use a fixed mathematical equation but whose state transition logic needs to be specifically designed for each frame size. In this context, many of the deterministic interleavers defined herein can have some random components to them that rely on state transitions and state dependent logic that are different for each frame size. A design objective is to design and select DCI solutions that minimize the amount of hardware that needs to be specifically designed for each frame size.

Referring now to FIG. 8, a design method 800 is provided to design various types of deterministic constrained interleavers, π_(DCI):c→u, that can be used in block 311 of FIG. 3 and in related CTBC decoders. The design method 800 makes use of the vectorization matrix C and the implicit permutation matrix U as a mathematical framework to design vectorizable CI-3, CI-4 and mixed CI-3/CI-4 interleavers. The method 800 is performed off line and is executed separately for each frame size, K, supported by a given system, and a different DCI is designed for each supported frame size. Also, the method 800 is carried out within an outer loop that searches over various sets of deterministic interleavers. For example, when the deterministic interleaver used in the method 800 is a QPP interleaver, various combinations of the QPP parameters f₁ and f₂ of equation (1) can be used to generate different starting points for the method 800. The method 800 can then be executed for each set of parameters, f₁ and f₂. Some sets of parameters will generate DCI solutions and others may not, and in the end, a best set of parameters will be identified for use at a given frame size. At run time, the selected set of parameters, f₁ and f₂, will be used along with any of the alternative intra-row permutations determined by the method 800 as discussed below. Other permutation functions besides QPP permutation functions can alternatively be evaluated and selected to perform the MSB row-selection permutation, π_(MSB)[]. Any deterministic interleaver can be used to generate a permuted sequence of row addresses, and this row address interleaver need not even be vectorizable. In such cases the sets {LSBs}_(M) typically come from blocks 715 and/or 720 and/or a separate set of LSBs generators in the block 705.

The method 800 is first described in the context of designing CI-4 DCIs. As discussed in further detail below, the method 800 can also be configured to design CI-3 and mixed CI-3/CI-4 DCIs. In the descriptions of certain preferred embodiments below, it is assumed that the block 705 is a QPP interleaver that generates both a sequence of permuted row addresses from the MSBs and also generates a set of M permuted column addresses using the LSBs from a set of M=8 QPP recursive permutation address generators as discussed above in connection with FIG. 7.

Action 802 is similar to action 602 as discussed above, and when a CI-4 DCI is being designed, the action 802 preferably generates a complete μ-set, {(nj+t)_(μ): j=0, . . . , ρ−1}. However, in the method 800, elements of the μ-set are selected to be a subset of the rows in the vectorization matrix C. For example, if M=8, K=4096, K/M=512 and the (8,4) Hamming code of Table 1 is used, so that n=8, ρ=512, and μ=4, this means that half of the coded bits of c can be included in the μ-set. A selected μ-subset consisting of μ=4 bits will come from each codeword. Because the codewords of the OBC are loaded into the matrix C in column major order, the indices of all of the coded bits on each row of C will be congruent to the same value of t modulo n, given by π[i_(row)] MOD n=t. This implies that half of the rows of the matrix C can be included in the μ-set. Hence in a typical embodiment, the action 802 randomly selects a sequence of ρ μ-subsets, until, in this example, a selected set of K/M/2=256 rows has been placed, and the placement counter is set to PLACED=μ*ρ=4*512=2048. In some embodiments, in order to provide additional flexibility in mapping, the action 802 creates a partial μ-set in order to loosen the requirements later while placing coded bits subject to the constraints. Whether a complete or incomplete μ-set is selected, the action 805 identifies a corresponding set of permutation locations, {π(nj+t)_(μ): j=0, . . . , ρ−1}, for the μ-set selected in the action 802. In the example of FIG. 7, a QPP interleaver can be used in the block 705 in order to generate the permutation u=π[c]=π_(LSB)^(i_(row)=0, . . . , K/M−1)[π_(MSB)[C]], by applying π_(MSB)[] to all the 256 rows in the selected μ-set, {i_(row)}_(μ256), and the LSBs for each respective intra-row permutation can be the QPP permutation ({LSBs}_(M=8)) generated in the block 705. In this type of embodiment, the intra-row permutations generated by the address generator 705 are selected when possible to be used as the preferred default row permutations. With this placement of bits, none of the CI-4 constraints will have been violated because no codeword of the OBC will have completed.
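The row counting in this example (K=4096, M=8, n=8, μ=4) can be checked with a short sketch. The sketch assumes that a μ-subset of μ=4 residues t is drawn at random for each group of n consecutive rows of C, which is one possible reading of the random selection of ρ μ-subsets. The variable names and the seed are hypothetical.

```python
import random
from collections import Counter

K, M, n, mu = 4096, 8, 8, 4
rows = K // M                  # 512 rows in the vectorization matrix C
rho = K // n                   # 512 OBC codewords


def row_residue(i_row):
    """All coded-bit indices on a row of C share one residue t modulo n."""
    residues = {(rows * j_col + i_row) % n for j_col in range(M)}
    assert len(residues) == 1
    return residues.pop()


rng = random.Random(0)
mu_rows = []
for group in range(rows // n):                 # n consecutive rows span one codeword per column
    for t in sorted(rng.sample(range(n), mu)): # assumed: a fresh mu-subset per row group
        mu_rows.append(group * n + t)

PLACED = M * len(mu_rows)

# Count how many bits of each codeword fall in the mu-set.
bits_per_codeword = Counter(
    (rows * j_col + r) // n for r in mu_rows for j_col in range(M)
)
print(len(mu_rows), PLACED, set(bits_per_codeword.values()))
```

With these parameters the sketch selects 256 rows, sets PLACED to 2048=μ*ρ, and leaves every one of the ρ=512 codewords with exactly μ=4 placed bits, so that no codeword of the OBC is completed.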

Control next passes to action 810 where a remaining row of C, i.e., a row that has not yet been placed, is selected. In preferred embodiments, the selection 810 is generated using a pseudo random number generator that generates a row index into a vector that includes all the indices of the rows that have not been placed yet, although other selection criteria can be used. The currently selected row can be selected by first randomly selecting a value for the variable i_(row) which is outside of the original μ-set, e.g., the 256-element μ-set denoted {i_(row)}₂₅₆. The selected row 810 is then given by π_(MSB)[i_(row)].

Control next passes to an action 815 which performs an analysis function. The action 815 views the currently selected row as a mapping group as discussed in connection with the actions 615 and 620 above. All of the restricted zones for each coded bit position on the currently selected row, π_(MSB)[i_(row)], of the matrix C are preferably evaluated in the action 815. In a preferred embodiment, the action 815 generates a set of positions vectors for each of the coded bit positions on the π_(MSB)[i_(row)]^(th) row of the C matrix and passes these positions vectors to Algorithm 1 in order to identify their respective restricted zones. As discussed earlier, different target MHDs may be used in Algorithm 1 depending on LengthP of each positions vector. The action 815 preferably also maintains a data structure that records all of the positions vectors and restricted zones for all the elements that have been placed. As described below, the recorded information regarding the already placed positions vectors and restricted zones can be useful if a roll-back is required later, or for other purposes as discussed below in connection with FIG. 10.

Control next passes to an action 820 which starts by identifying a preferred default candidate intra-row permutation, π_(LSB)^(i_(row))[]. When M=8, a set of M=8 candidate permutation positions, π_(candidate-0), . . . , π_(candidate-7), can be identified for placing the M=8 coded bits of the π_(MSB)[i_(row)]^(th) row of the C matrix. For example, the preferred default candidate intra-row permutation can be the {LSBs}_(M=8) output of the block 705 in embodiments where a QPP interleaver is used in the block 705 of FIG. 7 as described above. If the default intra-row permutation is able to place the coded bit positions into the candidate locations, π_(candidate-0), . . . , π_(candidate-7), on the π_(MSB)[i_(row)]^(th) row without placing any coded bit positions into any of their respective restricted zones, then the default intra-row permutation is preferably selected. In this case the action 820 places the bits from the currently selected row, π_(MSB)[i_(row)], using the set of QPP-generated LSBs, {LSBs}_(M=8), as generated by the block 705 as discussed above in connection with FIG. 7.

As shown in FIG. 7, blocks 705, 715 and optionally 720 and more similarπ_(LSB) ^(i) ^(row) [] address generator blocks may be used. Firstconsider an embodiment where just the blocks 705 and 715 are used togenerate two different sequences of intra-row permutations, andπ_(LSB-Alt) ^(πi) ^(row) []. In such an embodiment, if the preferreddefault candidate intra-row permutation, π_(LSB) ^(πi) ^(row) [i_(row)],cannot place all of the coded bit positions in the π_(MSB)[i_(row)]^(th)row of C into non-restricted zones, then an alternative intra-rowpermutation π_(LSB-Alt) ^(πi) ^(row) [i_(row)] is identified that isable to place all of the coded bit positions in theπ_(MSB)[i_(row)]^(th) row of C into non-restricted zones. In suchembodiments, the multiplexer 725 is used to select π_(LSB-Alt) ^(πi)^(row) [i_(row)] when the corresponding row address, π_(MSB)[i_(row)],is generated that required the alternative intra-row permutation.Different such DCIs will need be designed and specified for use witheach frame size, K. If, during the action 820, no such permutation canbe identified, then one or more roll-backs can be attempted and/or themethod 1000 as discussed below in connection with FIG. 10 can be used.If all of the roll-backs fail, then a different permutation, π_(MSB)[],can be selected, for example, by checking another pair of the values f₁and f₂ in embodiments where π_(MSB)[] is a QPP permutation. Then themethod 800 is started anew with this new permutation, π_(MSB)[]. Insome embodiments a desired set of default intra-row permutation is notused and only one specifically designed set of permutations, π_(LSB-Alt)^(i) ^(row) ^(=0, . . . ,K/M−1)[] is identified. In such embodimentsthe block 705 only generates π_(MSB)[] and the block 715 is used togenerate the row-permutation sequence of LSBs, {LSBs}_(M=8), and block720 is not used. 
For example, coordinated instantiations of the method 800 can be run at different frame sizes and a set of alternative intra-row permutation sequences can be identified that provide DCI solutions at a plurality of different frame sizes.

In other kinds of embodiments, such as illustrated in FIG. 7, one or more intra-row permutation blocks like blocks 715 and 720 can each be configured like block 705 to count in accordance with a respective different permutation π_(MSB-Alt)[i_(row)], and to generate a respective different sequence of intra-row permutations, {LSBs}_(M-Alt). In such embodiments, two, three or more blocks 715 and 720 operate similarly to the block 705 to count in different permutation orderings so as to provide different alternative intra-row permutations for {LSBs}_(M=8), and the multiplexer 725 selects one of the blocks 715, 720, etc., to provide a selected set {LSBs}_(M=8) to be coupled to the M×M interconnection network 730 for use in permuting the π_(MSB)[i_(row)]^(th) row. In such embodiments, the block 705 generates the row address that is coupled to the memory 710, and the actions 815 and 820 are used to determine the selection control to the multiplexer 725 and to optimize the specific permutation rules used in blocks 715 and 720 and any additional such blocks, if used, in any given embodiment.
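The choice between a default and one or more alternative intra-row permutations can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the specification: each candidate permutation is a list giving the column assigned to each coded bit of the row, and each coded bit has a set of restricted columns. The function name `select_intra_row_permutation` and the example data are hypothetical. The returned index plays the role of the selection control of the multiplexer 725.

```python
from typing import Optional, Sequence, Set


def select_intra_row_permutation(
    candidates: Sequence[Sequence[int]],
    restricted: Sequence[Set[int]],
) -> Optional[int]:
    """Return the index of the first candidate permutation that keeps every
    coded bit of the row out of its restricted columns, or None if no
    candidate works (the case that would trigger a roll-back)."""
    for index, perm in enumerate(candidates):
        # perm[bit] is the column into which coded bit `bit` would be placed.
        if all(col not in restricted[bit] for bit, col in enumerate(perm)):
            return index
    return None


# Toy row with M=4 (assumed values): the default permutation puts bit 0 in
# column 0, which is restricted, so the alternative permutation is selected.
default_perm = [0, 1, 2, 3]
alt_perm = [1, 0, 3, 2]
restricted_zones = [{0}, set(), set(), set()]
mux_select = select_intra_row_permutation([default_perm, alt_perm], restricted_zones)
```

Listing the default permutation first reflects the stated preference for the default whenever it avoids all restricted zones.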

In all such embodiments of the method 800, control next passes to action 825 where the variable PLACED is incremented by M. If no valid placement could be found in the action 820, a flag is preferably set in the action 820 so that the action 825 will not increment the variable PLACED. Control next passes to action 830. If the variable PLACED has been incremented by M, and the value of PLACED is still less than K, then control passes from the action 830 back to the action 810. If an error condition has been marked in the action 820, then action 830 performs a roll-back and increments a roll-back counter. When a roll-back is carried out, the value of i_(row) and thus π_(MSB)[i_(row)] is rolled back to a previous value, and the value of PLACED is set to a lower value indicative of the roll-back point. If a certain number of roll-back attempts fail, a new deterministic interleaver can be selected for use as π_(MSB)[]. For example, if the deterministic interleaver is a QPP interleaver, the parameters f₁ and f₂ are adjusted and the method 800 is started over again using this new π_(MSB)[] permutation at the action 802. Also, if a certain number of roll-back attempts fail, the method 1000 as discussed below can be executed, especially for cases where the value of PLACED is close to K=μ*ρ.

To understand how roll-backs are performed, consider the above example where there are a total of 512 rows to be placed, and a “non-placeable row” error condition is flagged when attempting to place row π_(MSB)[i_(row)], which corresponds to the 507^(th) row in the row-placement sequence. The probability of having a “non-placeable row” error condition increases toward the end of the method 800 when the majority of the rows have already been placed. A roll-back can be performed in various ways. One approach is to not place the current non-placeable row, flag the current row as not having been placed, but to then increment PLACED, and continue to the next row, and continue doing this until PLACED reaches its final value. Assume that this is attempted, and by completion of the method 800, all of the rows could be placed except for rows, π_(MSB)[i_(row)], for i_(row)∈{14, 210, 507}. To perform the roll-back, the stored positions vectors and restricted zones are analyzed and the codewords involved in the positions vectors are identified. An analysis is performed to determine which already placed rows contain codeword positions that caused the difficulty in the placement of the codeword positions in the rows that could not be placed. The roll-back is then preferably performed by causing certain earlier-placed rows to be placed in such a way as to alleviate all of the placement problems in the problematic rows. This can even go as far as causing certain alternative permutations to be applied in the μ-set. Once the changes are made to the placement of a subset of already placed rows, the method 800 is restarted at the point after the earliest row that was changed (or after the μ-set if changes were made in the way any of the rows of the μ-set were placed). The method 800 is then allowed to run to completion or until another roll-back is needed. This process continues until the method 800 finds a solution or until a roll-back counter meets a threshold.
If the roll-back threshold is met, a new deterministic interleaver is used in the block 705, and the method 800 is started over with this new deterministic interleaver. For example, if the deterministic interleaver used in block 705 is a QPP interleaver, the parameters f₁ and f₂ are adjusted and the method 800 is started over again using this new π_(MSB)[] permutation at the action 802. Also, the method 1000 as discussed below in connection with FIG. 10 can be invoked to attempt to find a DCI solution for the failed run of the method 800.
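The interaction between row placement, the roll-back counter, and the roll-back threshold can be sketched as follows. This is only an illustration under stated assumptions: it uses plain chronological backtracking (undo the most recent row and try its next candidate) in place of the analysis-driven roll-back described above, and the names `place_rows`, `conflicts`, and `max_rollbacks` are hypothetical. A return value of `None` corresponds to the point at which a new deterministic interleaver would be selected.

```python
from typing import Callable, List, Optional


def place_rows(
    num_rows: int,
    num_candidates: int,
    conflicts: Callable[[List[int], int, int], bool],
    max_rollbacks: int,
) -> Optional[List[int]]:
    """Choose one candidate permutation index per row.

    conflicts(previous_choices, row, candidate) is an assumed callback that
    returns True when `candidate` cannot be used for `row` given the choices
    already made for rows 0..row-1.
    """
    choice = [-1] * num_rows  # -1 means no candidate tried yet for that row
    row = 0
    rollbacks = 0
    while row < num_rows:
        cand = choice[row] + 1
        while cand < num_candidates and conflicts(choice[:row], row, cand):
            cand += 1
        if cand < num_candidates:
            choice[row] = cand  # row placed; move on to the next row
            row += 1
        else:
            # "Non-placeable row": undo this row and roll back by one row.
            choice[row] = -1
            rollbacks += 1
            if row == 0 or rollbacks > max_rollbacks:
                return None  # give up; a new interleaver would be tried
            row -= 1
    return choice


def toy_conflicts(prev: List[int], row: int, cand: int) -> bool:
    # Assumed toy rule: adjacent rows may not share a candidate index, and
    # row 2 may not use candidate 0.
    return (row > 0 and prev[row - 1] == cand) or (row == 2 and cand == 0)


solution = place_rows(3, 2, toy_conflicts, max_rollbacks=5)
gave_up = place_rows(3, 2, toy_conflicts, max_rollbacks=1)
```

In the toy run, row 2 cannot be placed on the first pass, two roll-backs are needed, and the search succeeds only when the threshold allows them.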

Like the method 600, the method 800 can be configured to perform an “analysis run” that does not place coded bit positions from c to u in actions 805 and 820. In analysis runs, the coded bit positions have all already been placed, so the method 800 is used instead to identify restricted zones for any specified MHD, e.g., MHD>d_(t). The output of an analysis run includes a set of positions vectors (and their respective lengths) that correspond to low weight CTBC coded sequences whose weights are d_(t)<d≤d_(f), where d_(f) corresponds to the highest weight of sequences that need to be identified in the analysis run. Other information such as restricted zones or other statistical information such as the category of each identified low weight positions vector and counts of positions vectors in each category can also be provided.

The method 800 can also be configured to perform a CI-3 interleaver design, or a mixed CI-3/CI-4 interleaver design. First consider configuring the method 800 to perform a CI-3 interleaver design. To start, the action 802 is configured by setting the parameter μ to μ=1. This causes action 805 to place just one row without constraints. Control next passes to action 810 which can select a next remaining row, i_(row), for example, by using a random number generator, and can then select the next row of C to be placed to be π_(MSB)[i_(row)]. In CI-3 design embodiments, action 815 is configured to identify the restricted zones for all coded bit positions on the selected row due to Constraints 1-4. The same CI-3 checks (Constraints 1-4) are used as discussed in connection with action 615 of the method 600 when configured for CI-3 based designs as discussed in connection with FIG. 6 above. The placement of a row using CI-3 constraints in actions 815-820 is similar to the placement of a mapping group using CI-3 constraints as discussed in connection with actions 615-620 in connection with FIG. 6. In the action 820, each of the coded bit positions on the selected π_(MSB)[]^(th) row of C can be placed similarly to a mapping group in the action 620 or similarly to the embodiments described above in connection with the action 820 as used to design a CI-4 DCI, except using the restricted zones as determined using Constraints 1-4 in the action 815.

To configure the method 800 to design a mixed CI-3/CI-4 DCI, the method 800 is instantiated twice, with one instantiation configured to design a CI-3 interleaver as described above (“the CI-3 instantiation”) and the other instantiation configured to design a CI-4 interleaver as also described above (“the CI-4 instantiation”). These two instantiations work on the same mixed CI-3/CI-4 DCI design problem together and communicate and synchronize with each other as described above in connection with the method 600, which uses a CI-3 instantiation of the method 600 in communication and in synchronization with the CI-4 instantiation of the method 600. The difference is that the mixed CI-3/CI-4 vectorizable DCI uses a CI-3 instantiation of the method 800 in communication and in synchronization with the CI-4 instantiation of the method 800.

The method 800 can also be modified to operate with a plurality of different target MHDs. This type of operation is similar to the above described embodiments of the method 600, except the multiple target MHDs are applied in the action 820 instead of the action 620 as described above.

In an alternative embodiment of the method 800, the number of elements per row is set to M=1 so that there is only one column which has K/M=K coded bits in a single column (i.e., an entire frame). Length K DCIs of this kind have the disadvantage that they are not vectorizable, but they have increased interleaver gain as compared to a vectorizable DCI. In such embodiments, the Constraint 6 will not be enforced. A starting deterministic interleaver is provided, π_(MSB)[], that has a frame size of K. The actions 802 and 805 operate like actions 602 and 605 to select and place a set, and action 810 behaves similarly to action 610 to preferably select a single coded bit to be placed. In such embodiments action 820 places each coded bit in accordance with the deterministic interleaver, π_(MSB)[]. A full data structure of related information such as positions vectors, restricted zones, and constraint violations in placing bits in accordance with π_(MSB)[] is recorded. The final result is an analysis of π_(MSB)[] to determine how close it is to a length K DCI. As discussed in connection with FIG. 9 and FIG. 10, additional hardware can be provided and actions taken to enforce the constraints at run time.

FIG. 9 is a block diagram of an alternative embodiment of a deterministic constrained interleaver (DCI) 900. Like a QPP interleaver, a control input to the deterministic constrained interleaver 900 is a set of indices, i=0, . . . K−1, or a set of recursion start-up parameters. The indices and/or the interleaved addresses are typically used to index a vector containing data elements, where the data elements can be items like information bits, coded bits, branch metrics, extrinsic LLR (log-likelihood ratio) numbers, and the like. In applications where the deterministic constrained interleaver (DCI) 900 is used in encoder/decoder systems that do not use CTBC codes, the data elements could correspond to any other type of data element that is indexed by the set of indices and interleaved addresses that are generated by the DCI 900.

A deterministic interleaver 905 generates a set of permuted indices using a deterministic formula-based calculation to generate the deterministic interleaver's address sequences under state machine or program control. The output of the deterministic interleaver 905 is coupled to a local constraint enforcer permutation block 910. As the name implies, the purpose of the local constraint enforcer permutation block 910 is to perform a local post-permutation to transform the deterministic interleaver's output to a valid constrained-interleaver permutation function. The local constraint enforcer permutation 910 takes as input the deterministically permuted sequence of indices and applies a predetermined set of correction permutations to ensure that the resulting sequence, u, meets a set of interleaver constraints such as any one or more of interleaver Constraints 1-6 as discussed above. For example, the local constraint enforcer permutation can apply a predetermined set of swaps or sub-permutations to convert the output of the deterministic interleaver 905 into a valid permutation of the deterministic constrained interleaver 900.
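The post-permutation role of the block 910 can be pictured with a short sketch. It assumes that the correction is a predetermined, ordered list of position swaps applied to the addresses produced by the deterministic interleaver. The function name `apply_local_enforcer` and the example values are hypothetical, and the sketch says nothing about how the swap list is designed.

```python
from typing import List, Sequence, Tuple


def apply_local_enforcer(
    det_perm: Sequence[int],
    swaps: Sequence[Tuple[int, int]],
) -> List[int]:
    """Compose a list of position swaps after a deterministic permutation.

    det_perm[i] is the position in u assigned to index i by the
    deterministic interleaver. Each (p, q) in `swaps` exchanges the
    contents of positions p and q of u; swaps are applied in list order.
    """
    out = list(det_perm)
    for p, q in swaps:
        exchange = {p: q, q: p}
        out = [exchange.get(pos, pos) for pos in out]
    return out


# Toy example (assumed values): a length-4 permutation corrected by one
# local swap of the adjacent positions 0 and 1.
pi_d = [2, 0, 3, 1]
pi_dci = apply_local_enforcer(pi_d, [(0, 1)])
```

Because each correction is itself a permutation, the composed result is still a permutation of the same K positions.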

Concepts similar to those described in connection with FIG. 6 and FIG. 8 can be used to determine the local constraint enforcer permutation 910. In practice the deterministic interleaver 905 is selected to be as close as possible to a DCI, and the local constraint enforcer permutation 910 is selected to perform a predetermined set of swaps or other type of reordering. The design objective is to minimize the hardware complexity needed to implement the local constraint enforcer permutation 910. As discussed in further detail below, the method 1000 can also be used to make improvements upon random (i.e., non-deterministic) constrained interleavers where the method 600 was not able to easily determine a CI-3, CI-4 or a mixed CI-3/CI-4 solution. In such cases, the swap permutations identified to be performed by the local constraint enforcer permutation 910 are incorporated directly into the random constrained interleaver. In this sense the method 1000 acts as a second layer of a roll-back action for use when the method 600 needs additional help in finding a CI-3, CI-4, or a mixed CI-3/CI-4 solution.

Referring now to FIG. 10, a method 1000 is provided to design the local constraint enforcer permutation 910. The design method 1000 can be used to design CI-3, CI-4 or mixed CI-3/CI-4 DCIs that implement all of Constraints 1-5. The method 1000 can also be used to design contention free DCIs that implement Constraint 6. As discussed below, the local constraint enforcer permutation 910 can also be used to help the method 600 find random constrained interleavers in cases where a solution is difficult to find. The method 1000 is first described herein in the context of designing a CI-4 interleaver. The same types of modifications as discussed above in connection with FIG. 6 and FIG. 8 can be used to design the local constraint enforcer permutation 910 for CI-3 and mixed CI-3/CI-4 DCIs.

The method 1000 begins with an action 1002 that selects a deterministic interleaver, π_(D)[], for use in the block 905 of FIG. 9. As discussed in connection with FIG. 8, the deterministic interleaver selected in action 1002 could be any deterministic interleaver. For example, different QPP interleavers could be selected by adjusting the parameters f₁ and f₂ of equation (1).
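As an illustration of selecting different QPP candidates, the sketch below assumes that equation (1) has the standard QPP form π(i) = (f₁·i + f₂·i²) mod K; that form, the helper names, and the rejected parameter pair are assumptions made for this example. The pair K=40, f₁=3, f₂=10 is the smallest entry of the LTE turbo code interleaver table.

```python
from typing import List


def qpp_permutation(K: int, f1: int, f2: int) -> List[int]:
    """Return [pi(0), ..., pi(K-1)] for pi(i) = (f1*i + f2*i*i) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def is_permutation(addresses: List[int]) -> bool:
    """True when every address 0..K-1 appears exactly once."""
    return sorted(addresses) == list(range(len(addresses)))


# A valid parameter pair for K=40.
pi_d = qpp_permutation(40, 3, 10)

# An assumed counter-example: with f1=3, f2=7 every output is even, so the
# mapping is not a permutation and the pair would be rejected.
rejected = qpp_permutation(40, 3, 7)
```

A candidate search over f₁ and f₂ would apply a check like `is_permutation` before passing a candidate on for constraint analysis.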

The deterministic interleaver 905, π_(D)[], that is identified inaction 1002 is then processed by action 1005 using an embodiment of themethod 800. For example, any of the embodiments of the method 800 asdescribed in connection with FIG. 8, or other similar variations thereofcould be used. The method 800 can be configured as a design run thatattempts to design a DCI, or as an analysis used run to measure theamount and extent of constraint violations that the deterministicinterleaver π_(D)[] selected in action 1002. As shown in FIG. 10, thereis a loop between actions 1005 and 1002. This loop is used to select agood deterministic interleaver candidate for use with the method 1000that is close as close to a DCI as possible. Closeness to a DCI ismeasured by recording all of the low weight CTBC vectors that exist inthe deterministic interleaver 905 below a specified target MHD, d_(t).

Alternatively, the method 1000 can be used similarly to a roll-back for cases where the method 800 was not able to find a deterministic constrained interleaver permutation, π_(DCI)[], to meet all the specified constraints. In such cases, the method 1000 can be viewed as an outer control loop that calls the method 800 to design a DCI from the action 1005. If the method 800 is able to find a valid DCI, then the method 1000 can exit at action 1005. If the method 800 is not able to find a DCI to meet a specified set of Constraints 1-6, then the analysis information provided by method 800 can be used to identify one or more interleavers that are close to a DCI but still have one or more identified constraint violations. Preferably a complete record of the relevant information from the one or more attempted design runs of the method 800 that provided the one or more best/closest approximations to a DCI is recorded in the action 1005. For example, the μ-sets, the lists of positions vectors used when placing each position, (nj+t), the respective restricted zones identified when placing each bit (nj+t), and the orderings used to make the rest of the placements after the μ-set as identified by each pass through action 810 would be identified for each candidate deterministic permutation π_(D)[] that is identified to be close to a deterministic constrained interleaver permutation π_(DCI)[]. The data structure will have preferably recorded a set of one or more coded bit positions, (nj+t), that could not be placed in such a way as to meet the target MHD during the previous run of the method 800. This way, the complete state and history of the runs of the method 800 that resulted in each candidate π_(D)[] could be made available to the method 1000. Depending on the embodiment of the method 1000, only a subset of all the recorded information described above may need to be recorded for use by the rest of the method 1000.
In embodiments where the action 1005 calls the method 800 to provide only an analysis run, a similar set of information would be provided by the analysis run of the method 800.

Control next passes to an action 1010 that places a μ-set similar to the action 805 as described above. The same placement as used in the action 805 when the method 800 was called from the action 1005 is preferably used. The previous run of the method 800 could have been an analysis run or a failed design run as discussed above. Control next passes to an action 1015 which selects a next coded bit position or a mapping group to be placed. This action can be performed similarly to any of the embodiments of the action 810 as described above. In preferred embodiments the action 1015 follows the stored ordering that was generated by looping through the action 810 by the method 800 when it was called from the action 1005. When this ordering is used, the information recorded in the previously described data structure will be perfectly synchronized with the current run of the method 1000. In alternative embodiments, the mapping group is selected based upon, for example, a window of positions in the vector u where the state information data structure provided by the previous run of the method 800 indicates that constraint violations exist that need reconciliation.

Control next passes to an action 1020 that identifies one or more respective local swap lists associated with each of the one or more current bit positions in the current mapping group. For example, the action 1020 analyzes each bit position (nj+t) of the current mapping group to determine whether the deterministic interleaver 905's permutation location, π_(D)[(nj+t)], is in a restricted zone. If π_(D)[(nj+t)] does not correspond to a location in any restricted zone of (nj+t), then a respective local swap list, L_(swap)(π_(D)[(nj+t)]), is left empty. In such a case, if there is only one element in the mapping group, then control passes to actions 1025 and 1030 where the variable PLACED is incremented and control is looped back to the action 1015 to select the next placed element to be analyzed. If π_(D)[(nj+t)] is in an identified restricted zone of (nj+t), then the local swap list L_(swap)(π_(D)[(nj+t)]) will need to be built. The local swap list will contain a list of positions that are both local to the mapped position π_(D)[(nj+t)] and are outside any restricted zones of (nj+t).

The concept of “local” is relative to the underlying hardware on which the π_(DCI)[] 900 is to be implemented. For example, if the interleaver π_(DCI)[] 900 is not being designed to be a vectorizable interleaver, a local set of candidate swap locations can be defined as a window, π_(D)[(nj+t)]±w_(d), where w_(d) corresponds to a window distance and is used to define a window around the position π_(D)[(nj+t)] in u. If the interleaver π_(DCI)[] 900 is being designed to be a vectorizable interleaver, then “local” typically refers to a two-dimensional window area, given by U(π_(MSB)[i_(row)]±w_(d-row), π_(LSB)[j_(col)]±w_(d-col)). Typically the smaller the value of w_(d), w_(d-row), and/or w_(d-col), the lower the complexity that will be required to implement the local constraint enforcer permutation 910. The window size used in a swap zone is preferably made as small as possible and the minimum possible window size is dependent on the distance to a nearest swappable position in u as discussed below. In many cases the window need not be centered on the position π_(D)[(nj+t)], but as discussed below, the window edges will be determined by the edges of certain relevant restricted zones. When the current mapping group contains more than a single coded bit position, the minimum possible window size for use with a swap list can also be influenced by the other swap lists of other elements in the current mapping group.

Without loss of generality, assume for now that the DCI 900 being designed is not required to be contention free, so the simpler one-dimensional window, π_(D)[(nj+t)]±w_(d) in u is in use. Also assume that the current mapping group only has one element, (nj+t), and that π_(D)[(nj+t)] has been placed into a restricted zone of (nj+t). In this example the local swap list, L_(swap)(π_(D)[(nj+t)]), will need to be built. The local swap list is built by starting with the smallest window size possible. The smallest window size possible is influenced by the restricted zone in which the coded bit position, (nj+t), has been placed, i.e., the restricted zone of (nj+t) around π_(D)[(nj+t)]. In this example, suppose that the restricted zone into which (nj+t) has been placed can be defined as the range [π_(D)[(nj+t)]−rz₁, π_(D)[(nj+t)]+rz₂], where rz₁ and rz₂ are parameters that define the restricted zone edges relative to the placed position, π_(D)[(nj+t)].

Continuing with this example, and focusing on the increasing direction in u, there will be a neighboring bit position at π_(D)[(nj₂+t)]=π_(D)[(nj+t)]+rz₂+1. The action 1020 will next check to determine whether π_(D)[(nj+t)] is in a restricted zone of bit position (nj₂+t). If π_(D)[(nj+t)] is not in a restricted zone of bit position (nj₂+t), then π_(D)[(nj₂+t)] can be added to the swap list L_(swap)(π_(D)[(nj+t)]). This is because, if the positions π_(D)[(nj+t)] and π_(D)[(nj₂+t)] are swapped in u, then after the swap, the coded bit position (nj+t) will have moved out of its restricted zone and the coded bit position (nj₂+t) will still be outside of its restricted zones. The local swap list L_(swap)(π_(D)[(nj+t)]) can thus be built in this way in both the increasing and decreasing directions in u (or 2-dimensional indexing-area in U). The entire swap list, L_(swap)(π_(D)[(nj+t)]), need not all be built at once, but can be expanded as needed to include more elements. The idea is to start with the closest elements in u and to grow the list as needed. If another restricted zone of (nj+t) is encountered while expanding outward in any direction from the center position, π_(D)[(nj+t)], those points are skipped over to one point beyond the distant edge of the newly encountered restricted zone.
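The one-dimensional swap-list construction can be sketched as follows. The model is an assumption made for illustration: restricted zones are given as inclusive (low, high) intervals of positions in u, `occupant[p]` names the coded bit currently placed at position p, and the names `in_zone` and `local_swap_list` are hypothetical. The sketch builds the whole list for a fixed window w_d, closest positions first, rather than growing it on demand.

```python
from typing import Dict, List, Sequence, Tuple

Zone = Tuple[int, int]  # inclusive (low, high) range of positions in u


def in_zone(pos: int, zones: Sequence[Zone]) -> bool:
    return any(lo <= pos <= hi for lo, hi in zones)


def local_swap_list(
    p: int,
    occupant: Sequence[int],
    zones: Dict[int, List[Zone]],
    w_d: int,
) -> List[int]:
    """List candidate swap positions for the bit at position p, nearest first.

    A position q qualifies when q lies outside every restricted zone of the
    bit at p, and p lies outside every restricted zone of the bit at q.
    """
    bit = occupant[p]
    candidates = []
    for dist in range(1, w_d + 1):
        for q in (p - dist, p + dist):
            if not 0 <= q < len(occupant):
                continue
            if in_zone(q, zones.get(bit, [])):
                continue  # still inside a restricted zone of the moved bit
            if in_zone(p, zones.get(occupant[q], [])):
                continue  # the neighbor would land in one of its own zones
            candidates.append(q)
    return candidates


# Toy data (assumed): bit b sits at position b; bit 3 is inside its own
# restricted zone [2, 4], and bit 5 may not be moved to position 3.
occupant = list(range(8))
zones = {3: [(2, 4)], 5: [(3, 3)]}
swap_list = local_swap_list(3, occupant, zones, w_d=3)
```

Positions 2 and 4 are skipped because they lie inside the restricted zone of bit 3, and position 5 is skipped because its occupant could not move to position 3.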

Once one or more elements have been added to L_(swap)(π_(D)[(nj+t)]) in the action 1020, control next passes to an action 1025. Continuing with the simple example where there is only one element, (nj+t), in the mapping group, the action 1025 will typically start with the closest element of L_(swap)(π_(D)[(nj+t)]) and analyze whether this swap is a valid swap. A swap is said to be valid if it swaps π_(D)[(nj+t)] with π_(D)[(nj₂+t)] so as to eliminate the constraint violation in (nj+t) without introducing any new constraint violation associated with any (one or more) third coded bit position(s), π_(D)[(nj₃+t)]. To ensure the swap is valid, a check is first made by scanning each coded bit position in u in the vicinities of π_(D)[(nj+t)]±w_(d) and π_(D)[(nj₂+t)]±w_(d) and identifying any placed coded bit position in u, π_(D)[(nj₃+t)], that is associated with a respective coded bit position (nj₃+t) whose position vectors contain any coded bits from completed codewords associated with the codeword positions c_(j) and/or c_(j2). If any such coded bit positions π_(D)[(nj₃+t)] are found, then a further check is made to determine whether the proposed swap would cause a constraint violation associated with coded bit position (nj₃+t) to occur. If no such placements π_(D)[(nj₃+t)] in the local vicinity are found, the swap is determined to be valid, and the swap can be made or annotated in the list to be a valid potential swap for later use. If the swap is not valid, then additional positions in the local swap list can be checked, or control can pass back to the action 1020 to identify more elements to add to the swap list and then action 1025 is repeated, looping in this way until at least one valid swap is found. Once one or more valid swaps are found, the counter PLACED is incremented and control passes to an action 1030.
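The validity test for a proposed swap can be sketched with a stand-in constraint. The assumed rule, which is not one of Constraints 1-6, is that any two coded bits of the same length-n codeword must be at least `min_sep` positions apart in u. The names `violating_bits` and `is_valid_swap` are hypothetical, and the sketch rescans every placed bit instead of only the local vicinities of the two swapped positions.

```python
from typing import Dict, Set


def violating_bits(pos_of: Dict[int, int], n: int, min_sep: int) -> Set[int]:
    """Return the bits that sit closer than min_sep to a bit of the same
    codeword, where bit b belongs to codeword b // n."""
    bad: Set[int] = set()
    bits = sorted(pos_of)
    for a in bits:
        for b in bits:
            same_codeword = a < b and a // n == b // n
            if same_codeword and abs(pos_of[a] - pos_of[b]) < min_sep:
                bad.update((a, b))
    return bad


def is_valid_swap(pos_of: Dict[int, int], a: int, b: int, n: int, min_sep: int) -> bool:
    """A swap of bits a and b is valid when it clears the violation on a and
    creates no violation that did not exist before the swap."""
    before = violating_bits(pos_of, n, min_sep)
    trial = dict(pos_of)
    trial[a], trial[b] = trial[b], trial[a]
    after = violating_bits(trial, n, min_sep)
    return a not in after and after <= before


# Toy placement (assumed): bits 0 and 1 share a codeword and are adjacent.
pos_of = {0: 0, 1: 1, 2: 2, 3: 4, 4: 3, 5: 5}
ok_swap = is_valid_swap(pos_of, 1, 2, n=2, min_sep=2)
bad_swap = is_valid_swap(pos_of, 1, 3, n=2, min_sep=2)
```

Swapping bit 1 with bit 2 separates bits 0 and 1 without side effects, whereas swapping bit 1 with bit 3 moves bit 3 next to bit 2 of its own codeword and is rejected.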

As discussed above, in some cases a mapping group with Δ>1 elements is selected in the action 1015. In such cases, the above process is carried out, but by additionally observing the interactions between making multiple swaps. For example, if four elements are in the mapping group, it could turn out that several different valid swaps could have been made, but a particular valid swap caused a problem later. Hence computer-chess logic (“look ahead logic”) is used to analyze a set of potential valid swaps (“moves”) several moves into the future. Such added logic of looking into a trellis of paths containing several moves into the future can be used to find a set of potential valid swaps that avoid having an earlier swap cause a problem for a later swap. In fact, this type of optimized forward looking trellis logic can be used with a mapping group that includes all of the bit positions that have constraint violations.

At times, an invalid swap may purposely be made. An invalid swap is made in order to be able to chain swaps. Chained swaps are used when the distance of the swap is too large for the underlying hardware, so that an actual swap is implemented as two sub-swaps, selected such that after the two sub-swaps there will be no constraint violations.

The method 1000 can also be used in conjunction with the method 600 that designs CI-3, CI-4, or mixed CI-3/CI-4 random constrained interleavers. If a random interleaver is being designed as per the method 600, then the method 1000 can have its action 1005 call the method 600 instead of the method 800. The method 1000 runs similarly as described above, but all of the swaps can be carried out off line and used to correct the random interleaver's permutation function so that all the constraints are enforced. In such embodiments, no separate constraint enforcer permutation 910 is needed because it is incorporated directly into the random interleaver's permutation function.

FIG. 11 is a block diagram illustrating an embodiment of a receiver/decoder structure 1100 in accordance with the present invention. The receive metrics calculator 1105 (and block 1205 of FIG. 12) is generally preceded by a receiver front end. The receiver front end, for example, receives and demodulates a signal such as an OFDM signal or an optically modulated BPSK or QAM signal. The receiver front end portion of the block 1105 could generally be embodied to implement any type of known signal demodulator/demapper, or using any of the signal mapping and rate matching techniques disclosed hereinbelow in connection with FIGS. 15-24. The receiver front end portion of the block 1105 is typically implemented in a separate chip or subsystem as compared to the chip that performs SISO decoding. In some embodiments the signal metrics calculation related operations performed by blocks 1105 and 1205 can also be performed by a chip separate from the chip that implements the rest of the decoder 1100 and 1200.

The receive metrics calculator 1105 calculates a set of input signal metrics. When optional rate matching is in use, the receive metrics calculator 1105 inserts dummy signal metrics to account for the bits that have been deleted due to the rate matching operation. Typically the signal metrics that are re-inserted based upon the puncture pattern generator 626 are set to zero, although other values could optionally be used. The receive metrics calculator 1105 couples these inverse rate matched receive signal metrics to a gamma and branch metrics initialization unit 1116.
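The re-insertion of dummy metrics can be sketched in a few lines. The sketch assumes the puncture pattern is available as a list of booleans, one per coded bit, with True marking a bit deleted by rate matching; the function name `insert_dummy_metrics` and the example values are hypothetical.

```python
from typing import List, Sequence


def insert_dummy_metrics(
    received: Sequence[float],
    punctured: Sequence[bool],
    fill: float = 0.0,
) -> List[float]:
    """Rebuild a full-length metric vector by inserting `fill` at every
    punctured position and keeping the received metrics in order elsewhere."""
    kept = sum(1 for p in punctured if not p)
    if kept != len(received):
        raise ValueError("puncture pattern does not match the received length")
    stream = iter(received)
    return [fill if p else next(stream) for p in punctured]


# Five coded bits, two of them punctured (assumed pattern).
pattern = [False, True, False, False, True]
full_metrics = insert_dummy_metrics([1.5, -2.0, 0.7], pattern)
```

A zero metric carries no preference for either bit value, which is why zero is the typical fill value.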

The gamma metrics initialization unit 1116 is configured to initialize the gamma metrics, typically by filling a gamma memory using the calculated received signal metrics coupled from the receive metrics calculator 1105. The gamma memory is coupled to (or built into as an integral part of) an inner code trellis SISO half iteration block 1117. The inner code trellis SISO half iteration block 1117 generally uses the initial gamma values to perform forward and backward state metrics recursions used to support trellis decoding operations used in SISO decoding. After the first iteration, during each inner code trellis SISO half iteration 1117, the gamma values are updated and then the forward and backward state recursions (forward alpha and backward beta recursions) are carried out to update the alphas and the betas in block 1117. To do these updates, a set of a-priori extrinsic LLR values are read from a 2D memory array 1160. An “a-priori extrinsic LLR value” refers to an extrinsic LLR value before an update occurs and an “a-posteriori extrinsic LLR value” refers to an extrinsic LLR value after an update occurs. Hence depending on exactly where in the SISO iteration the SISO decoder 1100 is processing and from which point in the SISO decoder algorithms one is looking, a given extrinsic LLR in the 2D memory 1160 keeps switching from being an a-priori extrinsic LLR value to an a-posteriori extrinsic LLR value, and back to an a-priori extrinsic LLR value and so on.

The order in which the a-priori extrinsic LLR values are read into and processed by the block 1117 is determined by the L=1 deterministic constrained interleaver (DCI) or random constrained interleaver (RCI) address generator 1161. The address generator 1161 makes sure the a-priori extrinsic LLR values are sent to block 1117 in L=1 DCI or RCI interleaved order. After the inner code trellis SISO half iteration is complete, a set of updated (a-posteriori) extrinsic LLR values are written back into the 2D memory array 1160 using the same interleaved ordering as discussed above, i.e., ordering determined by the DCI or RCI ordering used in the address generator 1161. The 2D memory 1160 can be viewed as holding the U matrix as described above and can be stored in the physical two-dimensional memory array 710 as discussed in connection with FIG. 7.

It can be noted that the 2D memory array block 1160 appears twice in FIG. 11. This is the same memory array 1160, but in the first half of the SISO iteration an interleaved-address ordering is used to access the U matrix in the 2D memory, and in the second half of the SISO iteration, a natural-address ordering is used to access the C matrix in the 2D memory. That is, once all the updated extrinsic LLR values have been returned to the 2D memory array by the inner code trellis SISO half iteration calculation unit 1117, i.e., when the first half iteration has completed, the second half of the SISO iteration begins. A natural order address generator 1162 is typically implemented in hardware as an alternative address sequence mode in the same module as the address generator 1161. The address generator 1161 is preferably configured to switch to the natural ordering mode 1162, i.e., to count as a simple natural order binary counter. The binary counter type natural-order address generator 1162 is coupled to the address bus of the 2D memory array 1160. The a-priori extrinsic LLRs are thus read out to the outer block code (OBC) SISO half iteration block 1126. The OBC half iteration block performs an iteration of block code soft decoding. In some applications other types of codes like LDPC codes could optionally be decoded in the block 1126. In general, any soft block decoding type algorithm and any type of block code can be soft decoded in the block 1126. After each a-priori extrinsic LLR is updated by the block code decoding operation in block 1126, the a-posteriori extrinsic LLR or a parallel set of extrinsic LLRs are returned to the 2D memory array 1160.
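The two access modes of the single LLR memory can be sketched as follows, using a flat list in place of the 2D array and a placeholder update in place of either SISO half iteration. The names `address_sequence` and `half_iteration` and the toy interleaver are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def address_sequence(K: int, interleaver: Optional[Sequence[int]] = None) -> List[int]:
    """Natural order when no interleaver is given, interleaved order otherwise."""
    if interleaver is None:
        return list(range(K))
    return [interleaver[i] for i in range(K)]


def half_iteration(
    memory: List[float],
    addresses: Sequence[int],
    update: Callable[[List[float]], List[float]],
) -> List[float]:
    """Read a-priori values in the given address order, update them, and
    write the a-posteriori values back to the same addresses."""
    read_values = [memory[a] for a in addresses]
    for a, v in zip(addresses, update(read_values)):
        memory[a] = v
    return read_values


llr_memory = [10.0, 20.0, 30.0, 40.0]
toy_interleaver = [2, 0, 3, 1]  # assumed toy permutation


def bump(values: List[float]) -> List[float]:
    return [v + 1.0 for v in values]  # placeholder for a SISO update


# First half: interleaved addressing. Second half: natural addressing.
seen_by_inner = half_iteration(llr_memory, address_sequence(4, toy_interleaver), bump)
seen_by_outer = half_iteration(llr_memory, address_sequence(4), bump)
```

Because values are written back to the addresses they were read from, no separate de-interleaving pass is needed between the two halves.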

As is common practice, a stopping criterion is used to stop iterations. Although not shown, the stopping criterion may be implemented, for example, in block 1126 to indicate when the total LLRs have converged. To do this, one or more total a-posteriori LLRs are checked for convergence. If the convergence criterion is not met, the a-priori extrinsic LLR received from the memory 1160 is subtracted from this total a-posteriori LLR to produce the a-posteriori extrinsic LLR that is written back to the 2D memory array 1160 so that SISO iterations can continue. In this exemplary embodiment, if the convergence criterion is met, a control signal is generated to the control logic of the 2D memory array 1160, the block 1117 writes the total LLRs into the memory array 1160, and the control logic of the memory array 1160 causes the converged data values to be output from the system SISO decoder 1100. Alternatively, a fixed number of iterations may be used as the stopping criterion in the above description.
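The write-back decision described above can be sketched as follows. The text does not fix a particular convergence test, so the magnitude threshold used here is a hypothetical choice made only for illustration, as are the function and parameter names.

```python
def write_back_values(a_priori, total, threshold=8.0):
    """Decide what is written back to the LLR memory.

    a_priori  : a-priori extrinsic LLRs read from memory
    total     : total a-posteriori LLRs from the block decoder
    threshold : hypothetical convergence test, |total| >= threshold
                for every bit (the source leaves the test open)
    Returns (converged, values_to_write).
    """
    converged = all(abs(t) >= threshold for t in total)
    if converged:
        # Converged: the total LLRs themselves are written and output.
        return True, list(total)
    # Not converged: extrinsic = total minus a-priori, so iterations continue.
    return False, [t - a for t, a in zip(total, a_priori)]
```

A fixed iteration count, the alternative mentioned above, would simply replace the threshold test with a counter.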

The memory architecture 700 can be used to support the memory accesses needed for CTBC code SISO decoding. A discussion of the operation of the memory system 700 is provided in connection with FIG. 12, where the description of the 2D Array LLR RAM largely comports with the memory array 710 in FIG. 7. Thus the discussion of the operation of FIG. 12 applies as well to the memory system of FIG. 7, and items mentioned about the memory system 700 optionally and preferably apply to the SISO decoder of FIG. 12.

Referring now to FIG. 12, a receiver/decoder 1200 is provided for signal reception and real time SISO decoding of CTBC codes in accordance with an aspect of the present invention. The architecture 1200 is designed to implement a particular class of embodiments of the inventive CTBC code SISO decoder algorithm as described in connection with FIG. 11 and preferably using the memory and interleaving architecture as described in connection with FIG. 7. The architecture 1200 could be embodied in many different ways, and for many different types of applications ranging from 4G/5G LTE, WiFi, satellite communications, OTN, magnetic recording, optical disk channels, and the like.

Before describing FIG. 12 in detail, some observations are made regarding the use of the architecture in decoding a CTBC code that is targeted for use in 4G/5G LTE type applications. The CTBC decoding embodiment is compared to the highly optimized implementation of the 4G LTE CTC decoder of the Studer reference. To begin, first consider one more reference: J. Li, K. R. Narayanan, and C. Georghiades, “Product accumulate codes: a class of codes with near-capacity performance and low decoding complexity,” IEEE Transactions on Information Theory, vol. 50, no. 1, pp. 31-46, January 2004 (“the Li reference” herein). On page 36 of this article, the authors prove that applying the Min-sum decoding algorithm (known to those of skill in the art) to a sequence that has been encoded by the rate-1 accumulator is equivalent to applying Max-log-Map decoding to the same rate-1 accumulator encoded sequence. A complexity analysis is performed on pages 36-37 of the same paper, and on page 37 it is shown that Min-sum decoding of the rate-1 accumulator requires ⅛ as much work as the Max-log-Map BCJR algorithm applied to the same rate-1 accumulator encoded sequence.

With the above result in mind, consider the hardware and computational complexity needed to implement each of the half iterations of a SISO iteration to decode the 4G LTE CTC in the Studer reference. The Studer reference uses a radix-2 and a radix-4 Max-log-Map BCJR algorithm. The 4G LTE trellis code is an eight-state trellis code. Thus decoding such a trellis code requires performing eight gamma branch metrics calculations, one for each of the 4G LTE CTC's eight states, plus eight forward alpha state metrics recursions and eight backward beta state metrics recursions, plus an LLR update to update the extrinsic information (3×8+1=25 vector operations of length K_(sub), where K_(sub) is the length of each of the N=8 trellis subsequences). Therefore the order of complexity for decoding each of the N=8 trellis subsequences in each of the two half iterations used in the Studer reference's ASIC to perform the Max-log-Map BCJR algorithm is given by O(25 K_(sub)). As can be seen from equations (2), (3), and (4) in the Studer reference, each of these vector operations requires around six additions and compare-select-max type operations on average. Hence, in terms of actual operations performed by adders and/or compare circuits, a closer estimate of complexity would be O(150 K_(sub)). Additionally, some LLR based arithmetic is needed (scalar operations that do not add to the order of complexity calculations).

Next consider the complexity of the first half iteration 1117 of the CTBC code whose inner code has been selected to be the rate-1 accumulator. The rate-1 accumulator does not require an 8-state trellis decoding operation but instead requires a 2-state trellis decoding operation. Changing the number “eight” to the number “two” in the above analysis gives a complexity of O(6 K_(sub)) if the Max-log-Map BCJR algorithm is to be used. However, as mentioned above in connection with the Li reference, for the special case of decoding the rate-1 accumulator, the Max-log-Map BCJR algorithm used in the Studer reference is equivalent to the Min-sum algorithm, and the Min-sum algorithm requires roughly ⅛ as much work as the Max-log-Map BCJR algorithm operating on the same rate-1 accumulator. An inspection of Tables I and II of the Li reference reveals that the comparative complexity to implement the Min-sum algorithm is O(3K_(sub)). That is, the complexity to perform the first half of the SISO iteration 1117 is roughly (3/150)×100=2% of the work required to implement the first half iteration of the best current 4G hardware that relies on 8-state Turbo decoding.

When the OBC is selected to be a simple (8,4) Hamming code, this code will need to be soft decoded during the second half of the SISO iteration. As discussed in connection with FIG. 13, this (8,4) Hamming code has only 16 codewords and so can be efficiently and optimally soft decoded in hardware. As per FIG. 14 below, it can be seen that each input extrinsic LLR requires the computations of blocks 1410, 1415, 1420, and 1425. Adding up the corresponding complexities of these blocks gives (16+16+1+1)=34. That complexity is per bit, and there are K_(sub) bits, so the second half of the SISO iteration corresponding to the OBC requires O(34K_(sub)). Since (34/150)×100≈23%, this provides about a 77% drop in computational load as compared to the eight-state trellis decoding required by the CTC of the 4G LTE standard.
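The per-bit operation counts quoted in the preceding comparison can be reproduced with a few lines of arithmetic. The sketch below only restates the figures given in the text (25 vector operations, about six primitive operations each, 3 for Min-sum, 34 for the (8,4) soft block decode); the function and key names are illustrative.

```python
def complexity_summary():
    """Recompute the per-bit complexity figures quoted in the text."""
    states = 8                             # 4G LTE CTC trellis states
    vector_ops = 3 * states + 1            # gamma, alpha, beta per state + LLR update
    ops_per_vector_op = 6                  # approximate adds/compare-selects each
    ctc_ops = vector_ops * ops_per_vector_op

    min_sum_ops = 3                        # Min-sum on the rate-1 accumulator
    obc_ops = 16 + 16 + 1 + 1              # blocks 1410, 1415, 1420, 1425

    return {
        "ctc_vector_ops": vector_ops,
        "ctc_ops": ctc_ops,
        "first_half_percent": round(100 * min_sum_ops / ctc_ops),
        "second_half_percent": round(100 * obc_ops / ctc_ops),
    }


print(complexity_summary())
```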

The first half iteration 1117 can thus be implemented using about 2% as much computational complexity, while the second half iteration of the SISO iteration can be implemented using about 23% as much computational complexity. It can also be noted that the hardware and memory requirements to implement both the first and second half iterations drop considerably. The Min-sum algorithm requires three recursions, but not on a per-state basis (see Table I of the Li reference). Hence the state-metrics related memory requirement is reduced by roughly a factor of eight as well. As shown in FIG. 13, the memory requirement of the entire block decoder is just 32 memory locations. The hardware complexity of the functional unit 1300 and its operational program 1400 is very low as well. The different 8-bit block codes are 100% parallelizable within each parallel subsequence. Hence five of the very simple functional units 1300 could easily be implemented in each one of the N parallel subsequence channels. These functional units could operate together using a circular buffer ordering. For example, five sets of 8 extrinsic LLRs are read into the functional units 1300, one set at a time. As soon as its data is loaded, a functional unit begins processing. That is, the first functional unit could begin working while the second, third and fourth functional units were being loaded. The addresser could then go back and unload the results from the first functional unit, reload it with new a-priori extrinsic LLR values, and move on to the next functional unit. This would be occurring in all N=8 subsequence channels. Hence with very little parallel hardware, the speed up would become more like 23/5=4.6% of the speed requirement for the second half iteration, and so on.

Next consider the receiver/decoder 1200 in further detail. A signal is received and demodulated prior to being processed in a receiver metrics calculation block 1205. The block 1205 is typically preceded by a received signal demodulator to demodulate the received signal that has been modulated by a signal mapper that can include rate matching and spatial modulation components as are known in the art or as discussed in further detail below in the context of additional aspects of the present invention. The block 1205 can reside off chip from the rest of the decoder 1200, and can instead reside in one or more separate front-end circuits/chips designed to demodulate and preprocess the received signal.

The block 1205 computes a set of received signal metrics based upon the demodulated received signal. In embodiments where signal preprocessing includes rate matching, the receiver metrics calculation unit 1205 typically inserts a signal metric into the received signal metrics stream to compensate for a signal value that was deleted due to rate matching in the transmitter. In a preferred embodiment, the inserted signal metrics are set to zero, although other values could alternatively be used. To avoid cumbersome language, it is to be understood that, from here forward, when the term “receive metrics” is used in describing the receiver/decoder 1200, it can refer to the inverse rate matched received signal metrics.
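The metric insertion performed for inverse rate matching can be sketched as follows. This is an illustrative sketch only; the function name, the argument names, and the representation of the deleted positions as a list of indices are assumptions, since the text does not specify how the rate matching pattern is conveyed to the block 1205.

```python
def inverse_rate_match(received, deleted_positions, total_len, fill=0.0):
    """Re-insert a neutral metric at each position deleted by rate matching.

    received          : metrics for the values that were actually transmitted
    deleted_positions : indices (in the full-length stream) that were deleted
    total_len         : length of the full-length metric stream
    fill              : inserted metric, zero in the preferred embodiment
    """
    deleted = set(deleted_positions)
    if len(received) + len(deleted) != total_len:
        raise ValueError("lengths are inconsistent with the deletion pattern")
    stream = iter(received)
    return [fill if k in deleted else next(stream) for k in range(total_len)]


# Positions 1 and 3 were deleted at the transmitter.
print(inverse_rate_match([1.5, -0.5, 2.0], [1, 3], 5))
# prints [1.5, 0.0, -0.5, 0.0, 2.0]
```

A zero metric expresses no preference between the two bit values, which is why it is a natural fill value for a position that was never transmitted.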

The receive metrics calculation block 1205 couples its output receive metrics to a receive metrics RAM block 1210. Associated with the receive metrics RAM 1210 is a gamma branch metrics RAM 1220. The receive metrics RAM 1210 and the gamma branch metrics RAM 1220 may be merged into one memory embodiment as the receive metrics are typically used to initialize the gamma metrics. The receive metrics/gamma metrics RAM 1220 typically holds sets of gamma values, alpha values, and beta values. The output of the RAM 1210/1220 is coupled to an M-level parallel gamma-branch metrics calculation engine 1215. In general, the blocks 1210 and 1220 may be implemented as sets of sub-memories that are distributed and tightly coupled with (i.e., existing within) a set of specialized arithmetic-logic processing circuits within the M-level parallel gamma-branch metrics calculation engine 1215. For example, in the CTBC code example discussed in connection with the CTC currently in use in the 4G LTE standard, there would preferably be N=8 processing clusters inside the M-level parallel gamma-branch metrics calculation engine 1215. Each of these processing clusters would preferably contain three sets of arithmetic-logic processing circuits, one to update a set of alpha values (forward branch metric recursion), another to update a set of beta metrics (backward branch metric recursion), and another to update a gamma value (gamma update recursion). Given that only a two-state trellis typically needs to be decoded by the M-level parallel gamma-branch metrics calculation engine 1215, some of these hardware units could be eliminated. For example, one functional unit could be used to compute both the forward and the backward state metrics. If the Min-sum algorithm is used as discussed above to decode the rate-1 accumulator, even more reductions are possible. With the Min-sum algorithm, the work required is as if there were only one state in the trellis. Hence significantly lower complexity hardware can be designed. See Table I of the Li reference for a comparison.

Therefore the block 1220 would preferably include small RAM blocks collocated with the alpha, beta and gamma updating hardware. That is, the M-level parallel gamma-branch metrics calculation engine 1215 is preferably embodied using M sets of parallel processing circuits tightly coupled and integrated with M different sub-memory modules that make up the memory blocks 1210/1220. Methods of initializing the alpha, beta and gamma values used in the various forms of the BCJR algorithm for each subsequence are well known to those of skill in the art. The receive metrics are used to initialize the gamma metrics, and the alpha and beta metrics are initialized in a selected way as is known to those of skill in the art for parallel SISO decoding of Turbo codes. For example, see the Studer and Roth references for techniques that would be known to one of ordinary skill in the art for initializing the parallel trellis subsequences. Many algorithms can be used to perform the soft trellis decoding on the parallel subsequences to decode the IRCC using M-level parallelism and finer grain sub-parallelism. That is, the blocks 1210, 1215, 1220 and 1225 can be configured to compute the operations of the first half SISO iteration as computed in block 1117 of the CTBC decoding algorithm of FIG. 11. Known SISO decoding algorithms such as MAP, Max-Log-MAP, Log-MAP and SOVA (soft output Viterbi algorithm) can be used to compute these soft trellis decoding half iterations in the blocks 1117, 1210, 1215, 1220, 1225. As mentioned above, when the rate-1 accumulator is used as the IRCC, the Min-sum algorithm can be used and has significantly lower hardware and software memory requirements.

The decoder 1200 uses the 2D-array extrinsic LLR RAM 1240 to hold the updated extrinsic LLR values, similar to the 2D memory array 1160 of FIG. 11. The deterministic interleaver address generator 1245 and the 2D-array extrinsic LLR RAM 1240 can be implemented using a structure similar to the memory array 710 in FIG. 7. The DCI address generator block 1245 generally corresponds to any combination or sub-combination of blocks 705, 715, 720, 725, and any additional optional LSB-address generator blocks as discussed in connection with FIG. 7. The M×M interconnect and constraint enforcer permutations block 1250 corresponds to block 730 of FIG. 7 and, optionally, additional hardware to implement block 1310 of FIG. 13. In the M×M interconnect and constraint enforcer permutations block 1250, the constraint enforcer permutations 1310 may more generally come after the block 730, before the block 730, may be integrated into the M×M spatial permutation 730 itself, or any combination thereof. Also, all double arrows shown in FIG. 12 can optionally be implemented as 2M data-word-wide, bi-directional, dual ported data paths having a dedicated set of M lanes of data words moving in each of the directions indicated by the double arrow. This is also true of all the double arrows shown in FIG. 7. Hardware block replication for east and west bound traffic is similar, e.g., block 1250 could be replicated to handle traffic moving to the left and moving to the right in FIG. 12.

Note that the M×M interconnect and constraint enforcer permutations block 1250 couples (optionally using the 2M lane bi-directional data busses as described above) to the 2D-array extrinsic LLR RAM 1240 and also to a processing array unit 1235 that includes both the M-level parallel extrinsic LLR trellis update calculation engine 1225 and an M-level parallel extrinsic LLR soft block decode update calculation engine 1230. In ASIC designs, certain functional units that are used in trellis decoding are reconfigured or controlled by a different set of program instructions or control signals to switch over to a second mode where they become engaged in block decoding SISO iterations as described below. That is, blocks 1225 and 1230 are shown inside a general block 1235 in order to indicate that certain hardware resources like functional units can be shared in a time division multiplexed fashion during the first and second halves of the SISO iteration. Also, the optional LSBs of the current extrinsic LLR address are shown as coming into the processor block 1235 to indicate that the processors themselves may be programmed or configured to perform constraint enforcement permutation operations that so far have been described as occurring in the block 1250. This LSBs path could optionally carry additional information besides the LSBs that relates to the interleaving function. Using this data/control path, the processors could be controlled to read/write data elements stored in a local register bank in a predetermined order in order to enforce pre-defined interleaver constraints. A state machine generating control signals in the block 1235 could cause extrinsic LLR values to be read into a local buffer accessed by a functional unit, and that functional unit would process those buffered elements in the prescribed order in accordance with a set of program instructions or hardware control signals.

Again referring to the M-level parallel extrinsic LLR updating engine 1235, each of the M internal processing engines in the M-level parallel extrinsic LLR soft block decode update calculation engine 1230 may use one or more parallel functional units to also optimally soft decode a specified block code such as an (8,4) Hamming code to update an extrinsic LLR value in the second half-SISO iteration. The optimal soft block decoding update is similar to the type of update that would be carried out in a half iteration of a SISO decoder configured to decode a turbo product code (TPC) (also known as block turbo code (BTC)). As discussed in connection with FIG. 13 below, the decoding of the (8,4) Hamming code can be implemented in very simple and efficient high speed parallel hardware.

The exemplary short (8,4) Hamming code can be optimally decoded using the approach that is well known to those of skill in the art and which is outlined in C. Xu, Y-C Liang and W. S. Leon, “A low complexity decoding algorithm for turbo product codes,” IEEE Radio and Wireless Symposium, pp. 209-212, January 2007 (“the Xu reference” herein). Longer block codes can also be soft decoded according to the algorithms well known to those of skill in the art as taught in R. M. Pyndiah, “Near-optimum decoding of product codes: Block Turbo Codes,” IEEE Trans. Comm., Vol. 46, No. 8, August 1998, pp. 1003-1010 (“the Pyndiah reference” herein). Depending on the length of the codeword used and other implementation factors, the M-level OBC SISO decoder 1230 can be configured to implement various well known forms of the above approaches for soft decoding of block codes, for example, the Chase-Pyndiah algorithm (also referred to as the Pyndiah algorithm), the low complexity Chase-Pyndiah algorithm, the OSD algorithm and its low complexity variations, the sum of product algorithm (SPA), or any similar soft decoding algorithm for decoding of block codes, as are well known in the technical literature.

In operation, the receiver and decoder 1200 performs as described above and performs the same CTBC code SISO iterations as described in detail in connection with FIG. 11 using a memory architecture similar to the one described in connection with FIG. 7 and optionally the local constraints enforcement permutation 1310. The main purpose of FIG. 12 is to show how a real time parallel CTBC SISO decoder can be implemented as a high speed ASIC or full custom VLSI chip, depending on the speed requirements of the application.

Referring now to FIG. 13, an architecture for a functional unit specifically designed to implement a soft decoder to soft decode an (8,4) Hamming code is provided. Note that this exemplary embodiment is designed to decode an OBC suitable for use in a 4G/5G type system that is based on a CTBC code instead of the current 4G LTE CTC. This specific example CTBC code uses the (8,4) Hamming code for the OBC, the rate-1 accumulator for the IRCC, and a QPP based DCI for the interleaver portion. In general, FIG. 13 could be modified by those of skill in the art to soft decode other short simple codes that may be chosen for use as the OBC in a CTBC code in other embodiments of the present invention. Short codes as discussed in connection with the functional unit 1300 are simpler to decode than a long code chosen for the OBC, for example a (72,64) BCH code, which was used in the Fonseka [3] reference in OTN applications to meet the coding overhead requirements. If such a long code is selected for the OBC, a low complexity Chase-Pyndiah, OSD or other such algorithm would be used. However, when a short code is used as the OBC in a CTBC code, the circuit complexity to soft decode the OBC becomes very small. Hence it is an aspect of the present invention to use a short code instead of a long code for the OBC, and to then perform rate matching as disclosed herein in order to meet, for example, an OTN coding overhead requirement or a 4G or 5G code rate requirement.

Also, as will be seen, since the memory and logic design of the functional unit 1300 is simple, a more powerful functional unit could be created by chaining five or so such functional units together into a parallel functional unit embodiment whereby each parallel functional unit can be loaded and unloaded in a circular buffer ordering. By the time the circle has completed in the circular buffer ordering, as one sub-functional unit 1300 is loaded the next functional unit (last functional unit loaded mod 5) has its results ready to read out. This way, with very little hardware, the block decoding portion of the SISO iteration could be balanced with the IRCC decoding speed.
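The circular buffer ordering of the chained functional units can be pictured with a simple event schedule. The following is a sketch under assumptions: it models only the order of load and unload events, not timing, and the function name and the tuple layout of the events are illustrative.

```python
def circular_schedule(num_blocks, num_units=5):
    """List (action, unit, block) events for round-robin use of the units.

    Codeword block b is handled by unit b mod num_units. Before a unit is
    reloaded, the results of the block it processed earlier are unloaded.
    """
    events = []
    for block in range(num_blocks):
        unit = block % num_units
        if block >= num_units:
            # This unit finished an earlier block while the others were loaded.
            events.append(("unload", unit, block - num_units))
        events.append(("load", unit, block))
    # Drain the results still held in the units after the last load.
    for block in range(max(0, num_blocks - num_units), num_blocks):
        events.append(("unload", block % num_units, block))
    return events


for event in circular_schedule(7):
    print(event)
```

With five units, each unit has the time of four other loads to finish its block before the addresser returns to it, which is the balancing effect described above.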

The design of methods and circuits to decode short codes such as (8,4) Hamming codes is well known. Such techniques can readily be used to design highly efficient soft decoders to decode one or more codewords of the OBC in parallel for use in each parallel processing channel of the M-level parallel LLR soft block code update calculation engine 1230 in FIG. 12. That is, in a preferred embodiment, the example functional unit 1300 is repeated one or more times in each of the M subsequence-channels inside the M-level parallel LLR soft block code decode calculation engine 1230. In the case of the exemplary CTBC code designed for 8-way parallel systems, the functional unit 1300 would be repeated M=8 times, and as mentioned above, possibly more, for example, 5×M=40 times in all.

In FIG. 13, a functional unit configured to soft decode an (8,4) Hamming code OBC is provided. An extrinsic LLR input/output buffer 1305 is coupled to one of the parallel lanes of the M-lane data bus shown between blocks 1250 and 1230 in FIG. 12. As mentioned before, each lane carries a respective extrinsic LLR value, and in some embodiments the bus is implemented as a 2M lane bus and dual port memories and buffers are used so that data traffic can be sped up per clock cycle by avoiding half-duplex related data delays. The extrinsic LLR input/output buffer 1305 is configured with either a single port or dual port bus interface as applicable depending on whether the M-lane or 2M-lane data bussing scheme is used on the double arrow to the left of block 1305, where input and output extrinsic LLRs are communicated to and from the block 1250.

The extrinsic LLR input/output buffer 1305 is a very small RAM that only uses 16 RAM/register locations (the microsequencer can be configured so that only 8 RAM/register locations are needed, as will become apparent below). The extrinsic LLR input/output buffer 1305 is coupled to a very simple arithmetic logic unit (ALU) 1310. A predetermined pattern generator 1315 is controllably coupled to the ALU 1310. The ALU executes a small predetermined set of instructions to perform addition, subtraction, and compare-and-select-max operations, preferably using signed-number fixed point arithmetic. The ALU executes these instructions in response to the signals provided by the pattern generator 1315. The output of the ALU 1310 is coupled to a dual accumulator/result register 1320. The dual accumulator/result register 1320 includes an A-accumulator register and a B-accumulator register. The A- and B-accumulator registers are more generally A- and B-result registers that can hold any intermediate results needed to support computations. Another small RAM is the codeword metrics memory 1325. Because the (8,4) Hamming code only has 16 possible codewords, the codeword metrics memory 1325 only requires 16 memory locations (i.e., registers). As can be seen from FIG. 13, the accumulator/result register 1320 has three feedback paths, one to the ALU 1310, another to the input of the codeword metrics memory 1325, and another to the extrinsic LLR input/output buffer 1305. The contents of both the A-accumulator and the B-accumulator can be fed back via these three feedback paths. In some embodiments, separate feedback data busses (2 lanes) could be provided so the contents of the A-accumulator and the B-accumulator could be fed back at the same time. In such embodiments, a buffer and a multiplexer would preferably be supplied at the three respective inputs to the blocks 1305, 1310, and 1325. Also, the ALU has three data path inputs, one from the extrinsic LLR input/output buffer 1305, another from the codeword metrics memory 1325, and another from the accumulator/result register 1320.

The functional unit 1300 also includes its own equivalent of a program memory, but this program memory is preferably implemented as a program logic microsequencer 1330. In some embodiments, some or all of this program logic microsequencer 1330 can be shared by all M of the functional units 1300 since most of the time they are executing exactly the same sequence of operations. In many embodiments, little or no instruction decoding is needed because the microsequencer 1330 can be configured to act as a pattern generator state machine that sequences through a set of states whose state outputs are a set of control signals that cause the different registers to be read and written in a specified order as discussed in more detail in connection with FIG. 14. The microsequencer 1330 can optionally be implemented in distributed hardware so that the state output control signals reside in logic or memory located right next to the register file or hardware device being controlled. If a microsequencer is not used, the functional units can each perform instruction decoding and all M of the functional units could decode the same instruction stream, and separate control issues could be handled using local condition codes as is known in the art. The pattern generator 1315 is an example of a distributed portion of the overall microsequencer 1330 in the embodiment shown of the functional unit 1300. Also, although not shown, the microsequencer 1330 can receive external control signals from other parts of the decoder 1200, such as the control signals from the M-level parallel extrinsic LLR soft block decode update calculation engine 1230, and change states accordingly to implement other functions such as outputting converged total LLR values as data values instead of outputting a-posteriori extrinsic LLR values.

To understand the operation of the (8,4) Hamming code soft decode functional unit 1300, consider the method/process 1400 of FIG. 14, which is tightly associated with the operation of the functional unit 1300 and the control sequences that emanate from the microsequencer 1330. In operation, at 1405, the extrinsic LLR input/output buffer 1305 receives, one at a time, eight different a-priori extrinsic LLR values. These eight extrinsic LLR values are stored into the first eight locations of the extrinsic LLR input/output buffer 1305.

Next, at 1410, each of the eight LLR values is sent to the ALU in a circular buffer order (i.e., LLR1, LLR2, . . . LLR8, LLR1, LLR2, . . . LLR8, . . . ) until all eight extrinsic LLRs have been cycled out to the ALU sixteen times. Each time a set of the eight stored LLR values is received in sequence at the ALU 1310, the pattern generator 1315 generates a respective sequence of eight bits corresponding to a respective one of the sixteen possible (8,4) Hamming codewords. Before the first set of the eight extrinsic LLR values is sent to the ALU 1310, the A-accumulator of the accumulator/result register 1320 is set to zero. Next the eight extrinsic LLRs stored in the extrinsic LLR input/output buffer 1305 are sequenced in order to the ALU 1310. As each i^(th) extrinsic LLR, for i=1, . . . , 8, is received at the ALU 1310, the corresponding i^(th) bit of the first Hamming codeword is output from the pattern generator. If the i^(th) bit of the first Hamming codeword is a one, the corresponding LLR is added by the ALU 1310 to the A-accumulator of the block 1320 and the result of the addition is stored back into the A-accumulator. If the i^(th) bit of the first Hamming codeword is a zero, the LLR is subtracted from the A-accumulator by the ALU 1310, and the result of the subtraction is stored back into the A-accumulator. After all eight LLRs have been processed this way, the result in the A-accumulator is written into the first position in the codeword metrics memory 1325. This process is then repeated for j=2, . . . , 16, once for each of the remaining 15 of the 16 unique 8-bit codewords of the (8,4) Hamming code. That is, as the above periodic sequence of extrinsic LLRs is clocked in a circular buffer fashion out of the extrinsic LLR input/output buffer 1305, the pattern generator, in synchronization, clocks out the set of sixteen (8,4) Hamming codewords, and the ALU responds to 1's as add commands and 0's as subtract commands. The program logic microsequencer 1330 sends out control signals to control the circular-buffer reading order of the extrinsic LLR input/output buffer 1305, and to control the writing of the A-accumulator results to the codeword metrics memory 1325 after the eight extrinsic LLRs are processed this way each of the 16 times.
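The accumulate step of sub-process 1410 can be sketched in software as follows. This is a minimal sketch under stated assumptions: the generator matrix G is one common systematic form of the extended (8,4) Hamming code and is not taken from the text, which does not list the sixteen codewords; clearing the accumulator before every codeword is stated explicitly only for the first codeword and is assumed for the others; the function names are illustrative.

```python
from itertools import product

# Assumed systematic generator matrix for an (8,4) extended Hamming code.
G = (
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
)


def hamming84_codewords():
    """Enumerate the 16 codewords (the role of the pattern generator)."""
    words = []
    for data in product((0, 1), repeat=4):
        word = [0] * 8
        for bit, row in zip(data, G):
            if bit:
                word = [w ^ r for w, r in zip(word, row)]
        words.append(tuple(word))
    return words


def codeword_metrics(llrs, codewords):
    """One metric per codeword: add the LLR where the codeword bit is 1,
    subtract it where the bit is 0."""
    metrics = []
    for word in codewords:
        acc = 0  # accumulator cleared before each codeword (assumed)
        for bit, llr in zip(word, llrs):
            acc = acc + llr if bit else acc - llr
        metrics.append(acc)  # written to the codeword metrics memory
    return metrics


words = hamming84_codewords()
print(codeword_metrics([1, 1, 1, 1, 1, 1, 1, 1], words))
```

The sixteen passes over eight LLRs account for the 16 operations per bit attributed to block 1410 in the complexity estimate.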

In the process above, if 2-lane bussing and dual ported register files are used inside the functional unit 1300, then the process can be sequenced to ping-pong between using the A-accumulator and the B-accumulator so that a result can begin accumulating in the B-accumulator while the A-accumulator is being written out. Such lower level optimizations can be used throughout the decoder 1200 to save clock cycles wherever desired.

The process 1400, generally as carried out in accordance with the program logic microsequencer 1330, next advances to the sub-process 1415 in FIG. 14. In the sub-process 1415, a total LLR metric is determined for each of the eight bit positions, i.e., one for each received a-priori extrinsic LLR stored in the input portion of the extrinsic LLR input/output buffer 1305. Due to a property of the (8,4) Hamming code, for each of the i=1, 2, . . . 8 bit positions of the (8,4) Hamming codeword, of the 16 valid codewords, eight valid codewords will have a zero in the i^(th) bit position, and eight valid codewords will have a one in the i^(th) bit position.

Therefore, in accordance with 1415 as enforced by the microsequencer 1330 and the pattern generator 1315 (which in general may be implemented as a part of the microsequencer 1330), a set of total LLRs will be computed. To begin, the A-accumulator and the B-accumulator are set to the most negative number representable by the signed fixed point numbering system used by the ALU 1310. Starting with the first bit position, the eight codeword metrics corresponding to the eight (8,4) Hamming codewords that have a one in the first bit position are sequenced out of the codeword metrics block 1325 and are coupled to the ALU 1310. As each new codeword metric arrives at the ALU 1310, the pattern generator 1315 sends a control signal that causes the ALU to execute a compare-and-select-max instruction, comparing the incoming codeword metric with the contents of the A-accumulator and storing the max value back into the A-accumulator. After this has been performed eight times for all eight of the selected codeword metrics, the A-accumulator will be left with the maximum of the codeword metrics that correspond to codewords that have a one in their first bit position. Next, staying with the first bit position, the eight codeword metrics corresponding to the eight (8,4) Hamming codewords that have a zero in the first bit position are sequenced out of the codeword metrics block 1325 and are coupled to the ALU 1310. As each new codeword metric arrives at the ALU 1310, the pattern generator 1315 sends a control signal that causes the ALU to execute a compare-and-select-max instruction, comparing the incoming codeword metric with the contents of the B-accumulator and storing the max value back into the B-accumulator. After this has been performed eight times for all eight of the selected codeword metrics, the B-accumulator will be left with the maximum of the codeword metrics that correspond to codewords that have a zero in their first bit position.

Next, in accordance with the sub-process 1420 of FIG. 14, the microsequencer causes the contents of the A-accumulator and the B-accumulator to be fed back to the input of the ALU 1310 while the pattern generator outputs a command telling the ALU to perform a subtraction and to put the result of the subtraction into the A-accumulator. The output of this subtraction is also stored back into the first one of the as yet unused eight (optional) locations of the 14-element extrinsic LLR input/output buffer 1305.

Next, in accordance with the sub-process 1425 of FIG. 14, the first a-priori extrinsic LLR from the extrinsic LLR input/output buffer 1305 is fed to the ALU and control signals are generated to cause this first a-priori extrinsic LLR to be subtracted from the A-accumulator and written back to the first location in the extrinsic LLR input/output buffer 1305. This value corresponds to the a-posteriori extrinsic LLR value to be used in the next iteration. This process is then repeated for the remaining bits, one at a time.

The sub-processes 1415, 1420, and 1425 have only been described for the first bit position. However, the same sub-processes 1415, 1420, and 1425 are also sequenced to be carried out for the remaining bit positions, i=2, . . . , 8. The a-posteriori extrinsic LLR values are sent back to the 2D memory 1240 to be used as a-priori extrinsic LLR values in the first half of the next SISO iteration. Additionally, the total LLR values may be used as a part of a stopping criterion. As SISO iterations continue, the total LLR values converge to the (8,4) Hamming codewords. The parity bits can be discarded and the four data bits from each word correspond to the output sequence of the SISO decoder 1200.
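By way of illustration only, the compare-and-select-max flow of sub-processes 1415, 1420, and 1425 can be sketched in software as follows. The generator matrix `G`, the function names, and the form of the codeword metric (a sum of the a-priori LLRs over the positions where a codeword has a one) are assumptions made for this sketch; the codeword metrics actually held in block 1325 are defined elsewhere in the specification.

```python
from itertools import product

# Assumed generator matrix for one systematic (8,4) extended Hamming code.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

def codewords():
    """Enumerate the 16 valid codewords generated by G."""
    words = []
    for data in product((0, 1), repeat=4):
        word = [0] * 8
        for bit, row in zip(data, G):
            if bit:
                word = [w ^ r for w, r in zip(word, row)]
        words.append(tuple(word))
    return words

CODEWORDS = codewords()

def codeword_metric(word, llr_in):
    # Assumed metric: sum of a-priori LLRs where the codeword has a one.
    return sum(l for c, l in zip(word, llr_in) if c)

def siso_update(llr_in):
    """Return (total LLRs, a-posteriori extrinsic LLRs) for 8 bit positions."""
    total, extrinsic = [], []
    for i in range(8):
        # Accumulators start at the most negative representable value.
        acc_a = acc_b = float("-inf")
        for word in CODEWORDS:
            metric = codeword_metric(word, llr_in)
            if word[i]:
                acc_a = max(acc_a, metric)   # codewords with a one in position i
            else:
                acc_b = max(acc_b, metric)   # codewords with a zero in position i
        t = acc_a - acc_b                    # sub-process 1420: total LLR
        total.append(t)
        extrinsic.append(t - llr_in[i])      # sub-process 1425: remove a-priori LLR
    return total, extrinsic
```

In this sketch the sixteen metrics are recomputed for every bit position for clarity; a hardware realization would read them from a precomputed metrics store, as described for block 1325.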

In an alternative embodiment, one microsequencer is used. The K functional units 1300 are sequenced to generate K answers in parallel instead of the pipelined approach Mod 5. Also, more parallelism can be extracted at the 1300 level; for example, 16 ALUs can be configured to operate in parallel. That is, both higher level parallelism and lower level parallelism within the functional units can be extracted using single instruction multiple data or multiple instruction multiple data control.

Constrained Interleaved Coded Modulation (CICM):

CTBC codes can be designed to provide both high MHD and high interleaver gain. When a CTBC code is transmitted through a Gaussian channel using BPSK signaling with constellation points at ±a or using Gray coded QPSK signaling with constellation points at {±a, ±a}, the CTBC code's MHD=d_(t) translates directly to a Minimum Squared Euclidean distance (MSED) of D_(min)²=4a²d_(t). When this same CTBC code is transmitted through a Gaussian channel using a larger Gray coded signal constellation where the minimum squared Euclidean distance between two constellation points is 4a², then the CTBC code's MHD=d_(t) also translates directly to an MSED of D_(min)²=4a²d_(t).

Bit interleaved coded modulation (BICM), as is known in the art, can be used to map the coded bits of an underlying code via an interleaver in such a way as to spread neighboring coded bits onto different symbols. The BICM interleaver is typically selected to be a uniform interleaver. BICM is known to perform better in fading channels because it can spread the neighboring coded bits of the underlying code onto different symbols.

“Constrained interleaved coded modulation” (CICM) is developed herein in accordance with an aspect of the present invention to map CTBC codes onto various sized signal constellations. As can be seen from the CI-3 and CI-4 design approaches, the complete set of low weight error sequences that dominate error performance (e.g., CTBC codewords with weights d_(t)≦d≦d_(f), that correspond to the sequences i_(P) in the tables P(d_(f)≧d≧d_(t))) can be readily identified and enumerated. This allows CICM mapping rules to be designed to provide MSED advantages similar to Ungerboeck's trellis coded modulation (TCM). Also, similar to BICM, the CICM interleaver is preferably designed to spread the non-zero coded bits of the identified low weight CTBC codewords onto different symbols (i.e., constellation points) transmitted during different symbol intervals, and this leads to improved performance over fading channels.

CICM can be viewed as a two step mapping process. The first step involves identifying a constellation mapping rule to map subsets of m coded bits onto constellation points. The coding policy preferably assigns high distances between constellation points that differ by a single bit and progressively smaller distances between constellation points that differ by more bits up to m bits. In a sense, this is the opposite of Gray coding, which assigns low distances between constellation points that differ by a single bit and progressively larger distances between constellation points that differ by more bits up to m bits. For this reason, the constellation mapping policies discussed herein for use with CICM are called “Reverse Gray Coded” (RGC) constellation mapping policies. The second step involves determining a CICM permutation function (interleaver rule) for use within the CICM mapper. If the frame size is big enough, the CICM interleaver can be designed to spread each possible pattern of d_(t) non-zero coded bits of each of the identified lowest weight (weight d_(t)) CTBC codewords onto d_(t) different symbols. Also, the permutation can be designed to ensure that changes in the values of each of these d_(t) non-zero coded bits correspond to respective large Euclidean distances on the constellation. Thus a “CICM mapping rule” includes a CICM permutation rule followed by a selected constellation mapping rule. A “CICM signal mapper” includes a CICM permutation Γ (a different type of constrained interleaver as compared to the CI-3 or CI-4 type constrained interleavers, π) followed by a selected constellation mapper such as an RGC constellation mapper for a given 2^(m)-ary signal constellation.

To better understand the mapping rule, consider the QPSK example of FIG. 15. In this example, the first step involves defining a constellation mapping rule that maps groups of m=2 bits onto QPSK symbols in such a way that a single bit change of the most significant bit is associated with a squared Euclidean distance of 8a², and a single bit change in the least significant bit is associated with a squared Euclidean distance of 4a². Note that the reverse Gray coded constellation mapping rule shown in FIG. 15 for the QPSK constellation is not unique. For example, the 10 and 01 labels on the lower two constellation points could be swapped, and that would cause changes in the second bit instead of the first bit to correspond to the higher distance on the constellation. RGC is the same as another type of constellation mapping known as “anti-Gray coding” for a QPSK constellation, but RGC differs from anti-Gray coding as the constellation size grows larger. The second step involves defining a CICM interleaver rule that places each set of the d≧d_(t) non-zero coded bits, associated with each identified weight d≧d_(t) CTBC codeword, onto d≧d_(t) different QPSK symbols. The CICM permutation will also preferably be designed to place all of these identified non-zero coded bits into the most significant bit positions in each of these QPSK symbols so as to cause the MSED to achieve D_(min)²=8a²d_(t) as opposed to the D_(min)²=4a²d_(t) achieved by Gray coded QPSK.
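The distance properties described above can be checked numerically. The following sketch uses one hypothetical labeling chosen only to be consistent with the stated properties (labels 10 and 01 on the lower two points, a most significant bit change giving 8a², a least significant bit change giving 4a²); the actual labeling of FIG. 15 is not reproduced here, and the names `RGC_QPSK`, `GRAY_QPSK`, and `flip_distances` are illustrative.

```python
a = 1.0  # constellation scale; points lie at (±a, ±a)

# Hypothetical reverse Gray coded labeling: (MSB, LSB) -> (I, Q).
RGC_QPSK = {
    (0, 0): (-a, +a),
    (1, 1): (+a, +a),
    (1, 0): (+a, -a),
    (0, 1): (-a, -a),
}

# A conventional Gray coded labeling, for comparison.
GRAY_QPSK = {
    (0, 0): (+a, +a),
    (0, 1): (-a, +a),
    (1, 1): (-a, -a),
    (1, 0): (+a, -a),
}

def sq_dist(p, q):
    """Squared Euclidean distance between two constellation points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def flip_distances(mapping, bit_index):
    """Set of squared distances produced by flipping one label bit."""
    out = set()
    for label, point in mapping.items():
        flipped = list(label)
        flipped[bit_index] ^= 1
        out.add(sq_dist(point, mapping[tuple(flipped)]))
    return out

print(flip_distances(RGC_QPSK, 0))   # MSB change
print(flip_distances(RGC_QPSK, 1))   # LSB change
print(flip_distances(GRAY_QPSK, 0), flip_distances(GRAY_QPSK, 1))
```

Under this assumed labeling, every most significant bit change moves to the diagonally opposite point, whereas under Gray coding every single bit change moves to an adjacent point.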

The minimum symbol Hamming distance, d_(s), is the minimum number of symbols onto which the non-zero coded bits of any coded sequence, v, will be mapped. For example, if each of the d_(t) non-zero coded bits of a weight d_(t) CTBC codeword are mapped onto separate respective symbols, then d_(s)=d_(t). The maximum achievable d_(s), denoted d_(s,max), results when all the non-zero coded bits in every weight d_(t) sequence of v are placed into different symbol intervals, so that d_(s,max)=d_(t). On the other hand, if the size of the signal constellation is M=2^(m), the lowest possible d_(s), i.e., d_(s,min), results if a coded sequence with weight d_(t) is allowed to feed all its d_(t) bits into only ┌d_(t)/m┐ m-bit symbols. In the worst case, the weight d_(t) sequence of v feeds its non-zero coded bits into ┌d_(t)/m┐−1 symbols completely and feeds any of its remaining bits into one other symbol. Hence, d_(s,min)=┌d_(t)/m┐, and the achievable d_(s) satisfies ┌d_(t)/m┐≦d_(s)≦d_(t). The CICM interleaver rule is designed to achieve the highest possible target value of d_(s), denoted as d_(s,t), subject to the constellation size, M, and the frame size, K.

In order to achieve any target symbol Hamming distance d_(s,t), in addition to observing only weight d_(t) sequences of v, it is also necessary to ensure that every higher weight sequence of v also results in at least a symbol Hamming weight of d_(s,t). Specifically, if the size of the signal constellation is M=2^(m), to achieve a symbol Hamming distance of ┌d_(t)/m┐<d_(s,t)≦d_(t), it is necessary that all valid CTBC codewords, v, with Hamming weight up to d_(w)=m(d_(s,t)−1) result in a symbol Hamming distance of at least d_(s,t). Because every symbol is formed by m bits, a coded sequence v with weight d>d_(w)=m(d_(s,t)−1) is guaranteed to feed its bits into at least d_(s,t) symbols. Therefore, to achieve the target value, d_(s,t), the non-zero coded bits of all low weight CTBC codewords with weight up to d_(w) need to be placed in such a way as to achieve the target symbol Hamming distance, d_(s,t). All CTBC codewords with weight higher than d_(w) will thus be guaranteed to have a symbol Hamming distance greater than or equal to d_(s,t).

Next consider how to achieve a target MSED. If the minimum squared Euclidean distance between any two constellation points is 4a², since a symbol is formed by m bits, every subset of m bits of v contributes at least 4a² to the squared Euclidean distance of that sequence and thus any weight d sequence of v is guaranteed to have a squared Euclidean distance of at least 4a²┌d/m┐. Therefore, at the sequence level, in order to maintain an MSED of D_(min)², it is necessary to make sure that all sequences of v with Hamming weight from d_(t) up to d_(e)=└mD_(min)²/4a²┘ achieve the selected MSED of D_(min)². Here “d_(e)” denotes the Hamming weight that is needed to meet the target MSED, and the subscript e denotes Euclidean. In order to ensure that the CICM mapping rule achieves a target symbol Hamming distance and a target MSED, it is necessary to consider all sequences v with weights starting from d_(t) and up to d_(f)=max{d_(w),d_(e)}. Here “d_(f)” denotes the final Hamming weight that is needed to meet both the target minimum symbol Hamming distance d_(s,t) and the target MSED D_(min)², as described above in connection with d_(w) and d_(e). In most practical cases, d_(e)>d_(w) so that d_(f)=d_(e).
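As a worked illustration of the formulas above, the following sketch computes d_(s,min), d_(w), d_(e), and d_(f) for given targets. The function name and the example values (d_(t)=6, QPSK) are hypothetical, and the target MSED is passed in normalized form, i.e., as D_(min)²/(4a²).

```python
import math

def cicm_weight_thresholds(d_t, m, d_s_target, msed_norm):
    """Return (d_s_min, d_w, d_e, d_f) for a 2^m-ary constellation.

    msed_norm is the target MSED divided by 4a^2 (an assumed calling
    convention for this sketch).
    """
    d_s_min = -(-d_t // m)                 # ceil(d_t / m)
    if not d_s_min <= d_s_target <= d_t:
        raise ValueError("target d_s must lie in [ceil(d_t/m), d_t]")
    d_w = m * (d_s_target - 1)             # weight bound for the d_s target
    d_e = math.floor(m * msed_norm)        # weight bound for the MSED target
    d_f = max(d_w, d_e)                    # final weight to be examined
    return d_s_min, d_w, d_e, d_f

# Hypothetical QPSK example: d_t = 6, target d_s = d_t, target MSED = 8a^2 * d_t.
print(cicm_weight_thresholds(d_t=6, m=2, d_s_target=6, msed_norm=2 * 6))
# -> (3, 10, 24, 24)
```

In this example d_(e)>d_(w), so that d_(f)=d_(e), which matches the observation that d_(e) usually governs in practical cases.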

The CICM interleaver constraints assume that the low weight CTBC codewords can be enumerated according to their weights. Recall that the CI-4 design algorithm identifies and eliminates all low weight CTBC codewords whose weights are less than d_(t). Similarly, an analysis run of the CI-4 design algorithm can be used to identify all of the low weight CTBC codewords at any desired Hamming weight d≧d_(t). All such low weight CTBC codewords, enumerated as i_(P)=0, . . . , N_(P)(d≧d_(t))−1, where N_(P)(d≧d_(t)) is the number of unique positions vectors in the table P(d≧d_(t)), can thereby be identified by a listing of their respective positions vectors, p(i_(P)), in the table P(d≧d_(t)), where each positions vector, p(i_(P)), lists the positions of “1”s (i.e., non-zero coded bits) of a respective weight d sequence, v(i_(P)). The table P(d≧d_(t)) can be viewed as being built up as a sequence of constituent tables, {P(d)}, where each constituent table tabulates all of the positions vectors, p(i_(P)), associated with respective CTBC codewords with a respective weight, d. That is, P(d≧d_(t))={P(d_(t)), P(d_(t)+1), . . . , P(d)}. The number of elements each positions vector, p(i_(P)), has is equal to the weight of its associated CTBC codeword, v(i_(P)), which is denoted as d(i_(P)). In any constituent table P(d), each positions vector, p(i_(P)), in the table P(d) can be enumerated and referred to as i_(P)=0, . . . , N_(P)(d)−1. Herein, the “sequence i_(P)” is used to generally refer to the positions vector, p(i_(P)), and/or the associated low weight coded sequence, v(i_(P)).

The CICM mapping rule involves: (a) selection of a constellation mapping policy to map each m-bit combination of coded bits onto a respective constellation point, and (b) selection of the CICM interleaver rule to permute the coded bits of the vector v, subject to the constraint that, once mapped, the CICM mapped sequence will exhibit the best set of target values of d_(s,t) and D_(min)² that can be achieved for a given frame size. The CICM interleaver rule can be viewed as a constrained interleaver whose constraints involve placing all of the non-zero coded bits of the low weight sequences identified in the table P(d≧d_(t)) in such a way as to enforce: (a) the target minimum symbol Hamming distance d_(s,t), and (b) the target MSED, D_(min)². In practice, an iterative algorithm will be used that will be initialized with the maximum possible d_(s,t)=d_(s,max)=d_(t) and the maximum possible D_(min)²=D_(min,max)² for the selected signal constellation and its constellation mapping rule. Using these values of d_(s,max) and D_(min,max)², starting values for d_(w), d_(e), and d_(f) are next computed using the formulas provided above. Next, subject to the selected constellation mapping rule and the specified frame size, K, an attempt is made to construct a CICM interleaver rule that meets the interleaver constraints for d_(s,max) and D_(min,max)². If the frame size is too small, the target d_(s,t) and D_(min)² values will be incrementally lowered and the design process will be repeated until a valid CICM interleaver rule is found to achieve the final values of d_(s,t) and D_(min)².

To design the CICM interleaver rule, an m×K/m permutation matrix, Γ, is defined. Each column of Γ can be considered to correspond to a respective symbol interval. The individual elements of Γ can be considered to be permutation indices pointing back into the vector v. Each column of Γ thus contains the indices of the coded bits from v that need to be constellation-mapped onto a symbol in each symbol interval. Similar to the CI-4 design approach, a “coded bit position” in v identifies a physical memory location, i, in the vector v, where 0≦i≦K−1. A “position” typically is used to refer to an index, i, in v, where a respective non-zero coded bit (i.e., a “1”) occurs in a respective one of the low weight error sequences identified by the table P(d≧d_(t)). Also, while the elements of the permutation matrix Γ are actually indices into the vector v, similar to the discussion of the CI-4 design process, the concept of “placing” a coded bit (position) from v into Γ will be used herein.

CICM Mapping Rule Design Algorithm:

To begin, the same sequential bit placement approach as used in the CI-4 design algorithm can be used to identify all of the coded sequences v with weight d, starting with d=d_(t). For example, once the CI-3 and/or CI-4 (or DCI) interleaver is designed, the same bit-placing ordering as used in the CI-4 design algorithm can be followed and Algorithm 1 can be called, but by replacing d_(t) with d≧d_(t) to identify all of the CTBC codewords having weight d. That is, an analysis run as described above can be performed, and this analysis run will cause Algorithm 1 to enumerate all possible CTBC coded sequences with weights d≧d_(t). The results of the analysis run can be used to create the table P(d≧d_(t)), which tabulates all of the positions vectors of all of the respective CTBC codewords of weights d≧d_(t). The table P(d≧d_(t)) can be readily sub-divided into a set of constituent tables, P(d_(t)), P(d_(t)+1), . . . , P(d), which each respectively list all of the positions vectors corresponding to the CTBC codewords that exist at each respective weight, d_(t), d_(t)+1, . . . , d.

In the analysis runs of the CI-4 design algorithm, the bits of c will already have been placed into u in such a way as to ensure that no CTBC codewords with weight less than d_(t) will exist. In each analysis run, no bits are placed, but all of the positions vectors identified by Algorithm 1 corresponding to the CTBC codewords with the weight d are tabulated into the table P(d). As will be seen later, it is useful to also tabulate information that identifies the contents of the non-zero OBC codeword positions, {c_(j)}, of the c vector associated with each tabulated sequence, i_(P).

Given the table P(d), a set E(d) is defined to be a set whose members are the distinct positions that appear in any of the positions vectors contained in P(d). The number of elements in the set E(d) is denoted as N(d). The number of positions vectors in P(d) in which a given position, i, occurs is denoted as Popularity(i,d). For example, if position v(50) only occurs in one of the sequences in the table P(d), then the index value i=50 would be included in E(d), the i=50 index would be counted once in N(d), and Popularity(i=50,d)=1. If position v(55) occurs in ten different ones of the sequences in table P(d), then the index value i=55 would be included in E(d), the i=55 index would be counted once in N(d), and Popularity(i=55,d)=10. Note that if a given position, i=70, is not used to hold any non-zero coded bits of any low weight sequences listed in table P(d), then the popularity of i=70 at this weight of d is zero, i.e., Popularity(i=70,d)=0.
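The quantities E(d), N(d), and Popularity(i,d) can be derived from a table of positions vectors as sketched below. The function name, the representation of P(d) as a list of tuples, and the toy table are assumptions made for illustration; the result is returned in descending order of popularity, which is the ordering used later by the design algorithm.

```python
from collections import Counter

def popularity_table(P_d):
    """Return (E_d, N_d, popularity) for a table of positions vectors.

    P_d is a list of positions vectors (tuples of indices into v).
    E_d lists the distinct positions, most popular first (ties are
    broken by index, an arbitrary choice made for this sketch).
    """
    popularity = Counter(pos for p in P_d for pos in set(p))
    E_d = sorted(popularity, key=lambda i: (-popularity[i], i))
    return E_d, len(E_d), popularity

# Toy table with three weight-3 sequences (hypothetical data).
P_d = [(3, 55, 61), (7, 55, 90), (3, 50, 55)]
E_d, N_d, pop = popularity_table(P_d)
print(E_d)        # [55, 3, 7, 50, 61, 90]
print(N_d)        # 6
print(pop[55])    # 3
print(pop[70])    # 0, since position 70 is unused at this weight
```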

The iterative CICM mapping rule design algorithm will attempt to place all the positions of v into Γ to achieve the maximum possible d_(s,t)=d_(s,max)=d_(t) and the maximum possible D_(min)²=D_(min,max)². However, the values of the parameters such as d_(t), the frame size, K, and the constellation size, M=2^(m), will determine the highest values of the targets d_(s,t) and D_(min)² that can actually be reached. Specifically, if the signal constellation size is M=2^(m), the CICM mapping rule design algorithm computes the associated value of d_(f), and then starts off by considering only Hamming weight d=d_(t) sequences in v. Next the design algorithm gradually increases d until a limiting condition is reached or until the d_(s,t)=d_(s,max)=d_(t) and D_(min)²=D_(min,max)² objectives are achieved with the final value of d=d_(f). In the event that the d_(s,max)=d_(t) and D_(min,max)² objectives cannot be achieved, then d_(s,t) and/or D_(min)² are decreased to achieve the next highest possible values of d_(s,t) and D_(min)². As discussed in further detail below, the amount by which d_(s,t) and/or D_(min)² are decreased depends on the maximum number of coded bits from a weight d sequence that will need to be loaded into any particular symbol, and the positioning of those bits on different symbols. Next a new (lower) value of d_(f) is calculated, and the process is repeated, building the table P(d≧d_(t)) for each d=d_(t), d_(t)+1, . . . , d_(f), and attempting to place all the positions of v from each constituent table P(d) into Γ to achieve the current (lowered) values of d_(s,t) and D_(min)². If the mapping is able to achieve the current values of d_(s,t) and D_(min)² all the way up to P(d_(f)), then the algorithm stops. Otherwise, d_(s,t) and/or D_(min)² are decreased again, and the design process is repeated until a valid CICM interleaver rule can be found to achieve a final pair of target values of d_(s,t) and D_(min)² at the specified frame size, K.

Without loss of generality, the CICM mapping rule design algorithm computes the normalized squared Euclidean distance by dividing it by the MSED on the constellation itself (which is 4a²), i.e., the normalized squared Euclidean distance is given by D_(en)²=D_(e)²/(4a²). This normalization is slightly different from the standard normalized squared Euclidean distance used in the literature, given by D²=D_(e)²/(2E_(b,avg)), or the normalized squared MED d_(min)²=D_(min)²/(2E_(b,avg)), which also takes into account the number of bits transmitted per interval, where E_(b,avg) is the average bit energy.

As the iterative design algorithm proceeds, certain quantities associated with individual sequences, i_(P)=0, . . . , N_(P)(d≧d_(t))−1, as listed in each table P(d≧d_(t)) can evolve. The quantities d_(s,temp)(i_(P)) and D_(en,temp)²(i_(P)) respectively represent the contributions to the symbol Hamming distance and to the normalized squared Euclidean distance due to the already placed positions of i_(P). The quantities d_(s)(i_(P)) and D_(en)²(i_(P)) respectively represent the actual symbol Hamming distance and the actual normalized squared Euclidean distance of the low weight sequence, i_(P), once it has finished being placed into Γ. The quantities d_(s,max)(i_(P)) and D_(en,max)²(i_(P)) respectively represent the maximum possible values that d_(s)(i_(P)) and D_(en)²(i_(P)) can possibly achieve for each low distance error sequence i_(P) as listed in the table P(d). These maximum possible values, d_(s,max)(i_(P)) and D_(en,max)²(i_(P)), are the values reached by the sequence i_(P) based on its already placed positions in Γ, assuming that its remaining positions can be placed in Γ so as to meet the CICM interleaver constraints. Once a sequence i_(P) is fully placed in accordance with the CICM interleaver constraints, d_(s,temp)(i_(P))=d_(s)(i_(P))=d_(s,max)(i_(P)) and D_(en,temp)²(i_(P))=D_(en)²(i_(P))=D_(en,max)²(i_(P)). The maximum achievable d_(s,t) and D_(min,n)² can be calculated at any point as d_(s,t)=min{d_(s,max)(i_(P))} and D_(min,n)²=min{D_(en,max)²(i_(P))}.
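For a sequence whose positions have all been placed, d_(s)(i_(P)) and D_(en)²(i_(P)) can be evaluated directly from Γ. The sketch below does this for the QPSK case (m=2) under an assumed reverse Gray coded labeling in which a column carrying only a most significant bit of the sequence contributes a normalized distance of 2 (i.e., 8a²/4a²), while a column carrying only a least significant bit, or both bits, contributes 1. The function name and the toy matrix are illustrative.

```python
def sequence_distances(gamma, positions):
    """Return (d_s, D_en^2) of one fully placed sequence for QPSK.

    gamma is a 2 x (K/2) list of lists of indices into v (None = vacant);
    row 0 holds most significant bits, row 1 least significant bits.
    positions is the positions vector p(i_P) of the sequence.
    """
    pos = set(positions)
    d_s = 0      # number of symbols touched by the sequence
    d_en2 = 0    # squared Euclidean distance normalized by 4a^2
    for msb_idx, lsb_idx in zip(gamma[0], gamma[1]):
        msb = msb_idx in pos
        lsb = lsb_idx in pos
        if msb or lsb:
            d_s += 1
            # Assumed labeling: MSB-only change gives 8a^2, otherwise 4a^2.
            d_en2 += 2 if (msb and not lsb) else 1
    return d_s, d_en2

# Toy frame with K = 8 coded bits (hypothetical placement).
gamma = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
print(sequence_distances(gamma, (0, 1, 2)))  # (3, 6)
print(sequence_distances(gamma, (0, 4, 2)))  # (2, 3)
```

The second call illustrates why sharing a column is costly: placing two positions of the same sequence in one symbol lowers both the symbol Hamming distance and the Euclidean distance of that sequence.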

CICM Mapping Rule Design Algorithm: QPSK Example

By way of example, consider the QPSK example using the coding policy as illustrated in FIG. 15. The steps below explain how to design the CICM interleaver rule, Γ, at a specified frame size, K, for the specific QPSK constellation and coding policy as illustrated in FIG. 15.

Step 1. Set d=d_(t) and perform an analysis run of the CI-4 design algorithm to identify all weight d_(t) CTBC codewords {v(i_(P))}, for i_(P)=0, . . . , N_(P)(d_(t))−1, and tabulate their respective positions vectors, {p(i_(P))}, into the table P(d_(t)). Form the set of all distinct positions of the set of all weight d_(t) CTBC codewords, E(d_(t)), and find the number of elements in E(d_(t)), N(d_(t)), and the Popularity(i, d_(t)) for each position, i, in E(d_(t)). Arrange the elements of E(d_(t)) in descending order of their popularity, i.e., the first element in the set E(d_(t)) appears in the most sequences in P(d_(t)) and the last element appears in the fewest. At this time, there is no information that suggests that d_(s,max) and D_(min,max)² cannot be reached, so in order to aim for the highest possible targets d_(s,t) and D_(min,n)², initialize each tabulated sequence, i_(P), as follows: d_(s,max)(i_(P))=d_(s,max)=d_(t), d_(s,t)=min{d_(s,max)(i_(P))}=d_(t), D_(en,max)²(i_(P))=D_(e,max)²/4a²=8a²d_(t)/4a²=2d_(t), and D_(min,n)²=min{D_(en,max)²(i_(P))}=2d_(t).

If K/2≧N(d_(t)), then all elements of E(d_(t)) can be placed on the first row of Γ. The first row of Γ contains the most significant bits of each of the K/m different m-bit symbols that are stored down the columns of Γ (m=2 in this example). Each of these most significant bits has a squared Euclidean distance of 8a² in the example of FIG. 15. If K/2≧N(d_(t)), place each of the unique positions as listed in E(d_(t)) onto the first row of Γ. In this QPSK example, it will be assumed that the positions in the set E(d_(t)) will be placed into the first row in the same popularity-ranked order as they occur in the set E(d_(t)). Alternatively, a random number generator could be used to assign the members of the set E(d_(t)) (i.e., the positions, i, of the “1”s in the weight d=d_(t) CTBC codewords, v) to column numbers of Γ. Other orderings could also be used, as discussed in further detail below. If the frame size K is large enough to assign all of the elements of E(d_(t)) to the first row of Γ, the highest possible target value of d_(s,t) will have been achieved, so that d_(s,t)=d_(s,max)=d_(t), and the highest possible D_(min)²=8a²d_(t) will also have been achieved, so that D_(min,n)²=D_(min,n,max)²=2d_(t). This means that coded sequences with weight d_(t) alone do not force d_(s,t) and D_(min)² to be lowered and the interleaver constraints have so far been met. Further, if K/2≧N(d_(t)), go from here directly to step 3.

Step 2. However, if K/2<N(d_(t)), some elements of E(d_(t)) will need to be placed on the second row of Γ. This suggests that it will not be possible to achieve a D_(min,n)² of D_(min,n,max)²=2d_(t) because there will be at least one coded sequence with weight d_(t) that cannot place all its non-zero positions on the first row (i.e., the most significant bit in FIG. 15). Under these conditions, define a subset, H, H⊂E(d_(t)), that contains the (N(d_(t))−K/2) positions from the set E(d_(t)) that will need to be placed onto the second row of Γ. It is desirable to determine the subset H that lowers the D_(en)²(i_(P)) values of the fewest sequences, i_(P). Therefore, the subset H is selected to include the least popular (N(d_(t))−K/2) positions of E(d_(t)), and these positions will be tabulated at the end of E(d_(t)) since the elements of E(d_(t)) are rank ordered from highest to lowest popularity. Once this subset H is identified, place the first K/2 unique positions as listed in the set E(d_(t)) directly on the first row of Γ using the existing ordering of the set E(d_(t)).

Next the positions of the subset H need to be placed onto the second row of Γ in such a way as to achieve the highest possible value for d_(s,t). In the specific example of FIG. 15, since m=2, there is only one remaining position in each column, i.e., row 2, but in general there are (m−1) rows below the first row. In order to maintain the highest possible value of d_(s,t), the elements of E(d_(t)) should be placed in such a way that no two positions of any sequence, i_(P), in P(d_(t)) are placed into the same column. This is preferably done by filling the columns of Γ one by one as described below.

Starting with the column whose first element contains the position, i, in E(d_(t)) whose popularity, Popularity(i, d_(t)), is the highest, the design algorithm attempts to match this position with the position(s) in the subset H that have the highest popularity. This is because the higher a position's popularity, the more potential conflicts it will have when being considered for placement into any candidate column, and thus the more difficult it will be to place later when few vacant locations are left in Γ. Therefore, the design algorithm places the more difficult positions first and leaves the easier to place positions with lower popularities for later.

To accomplish the above, because of the popularity-rank ordering in which the first K/2 positions of E(d_(t)) have been placed into the first row of Γ, the (1,1) position in Γ (the first position in the first row) will have the highest popularity. Next identify the position in the subset H with the highest popularity that is not a position of any sequence i_(P) that contains the (1,1) position. Due to the way that E(d_(t)) has been rank ordered according to popularity, this can be done by checking each position of H from left to right and selecting the first position in H that is not a position of any sequence, i_(P), that contains the position stored in the (1,1) location of Γ. Place the identified position of the subset H below the (1,1) location of Γ (i.e., the (2,1) location). Continue in this way by moving from column to column along the first row of Γ until all positions in H are placed into the second row of Γ in such a way that no column contains more than one position from any given sequence i_(P) in P(d_(t)). If this can be successfully done, it is still possible to achieve d_(s,t)=d_(s,max)=d_(t) based on all weight d_(t) coded sequences. If all positions of H cannot be placed in such a way that no column contains more than one position from any given sequence i_(P), one or more roll-back attempts as discussed above in connection with the CI-4 design algorithm can be made, but if the roll-back attempts fail, two positions of the same sequence i_(P) will have been placed in at least one column of Γ, and thus d_(s,t) must be lowered. Therefore, if two positions of the same sequence had to be placed in at least one column of Γ, update all of the d_(s)(i_(P)), D_(en,max)²(i_(P)), d_(s,t), D_(min,n)² and d_(f) values.
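The first-row and second-row placement of steps 1 and 2 can be sketched for QPSK as a single greedy pass, as shown below. This is a simplified illustration only: the function name and data layout are assumptions, ties in popularity are broken by index, and no roll-back is attempted, so the sketch simply reports failure (returns None) in the case where the procedure described above would attempt roll-backs or lower d_(s,t).

```python
from collections import Counter

def place_weight_dt(P_dt, K):
    """Greedy placement of the positions of P(d_t) into a 2 x (K/2) matrix.

    Returns [row1, row2] with None marking vacant locations, or None if
    some position of H could not be placed without putting two positions
    of the same sequence into one column.
    """
    popularity = Counter(pos for p in P_dt for pos in set(p))
    E = sorted(popularity, key=lambda i: (-popularity[i], i))
    n_cols = K // 2

    # Step 1: most popular positions go on the first (MSB) row.
    first = E[:n_cols]
    row1 = first + [None] * (n_cols - len(first))
    H = E[n_cols:]                      # least popular overflow positions
    row2 = [None] * n_cols

    # Positions that share at least one sequence conflict with each other.
    conflicts = {i: set() for i in E}
    for p in P_dt:
        for i in p:
            conflicts[i].update(p)

    # Step 2: walk the columns left to right, pairing each first-row
    # position with the most popular compatible position remaining in H.
    for col in range(n_cols):
        if not H:
            break
        top = row1[col]
        for cand in H:
            if top is None or cand not in conflicts[top]:
                row2[col] = cand
                H.remove(cand)
                break
    return None if H else [row1, row2]

# Toy table: four weight-3 sequences over six positions, K = 8.
P_dt = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]
print(place_weight_dt(P_dt, 8))   # [[0, 1, 2, 3], [5, 4, None, None]]
```

In the toy result, no column holds two positions of the same sequence, so every sequence still spans d_(t)=3 symbols.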

Next, step 3 is executed to place any remaining positions of v. If d_(s,t) had been lowered below d_(s,max)=d_(t), then, if necessary, some of the positions that were initially placed on the first row with the aim of achieving d_(s,t)=d_(t) can be judiciously removed from Γ to create room for the remaining positions of v, as discussed in further detail below.

Step 3. Set d=d+1 and perform an analysis run of the CI-4 design algorithm to identify all of the CTBC codewords having weight d, and tabulate their respective positions vectors, {p(i_(P))}, i_(P)=0, . . . , N_(P)(d)−1, into the table P(d) and use these identified sequences to update the table P(d≧d_(t)). Next identify the positions that have already been placed in Γ, and using these already-placed positions, calculate d_(s,temp)(i_(P)) and D_(en,temp)²(i_(P)) for each sequence, i_(P)=0, . . . , N_(P)(d)−1, listed in the table P(d). The d_(s,temp)(i_(P)) and D_(en,temp)²(i_(P)) values represent the symbol Hamming distance and the Euclidean distance contributions respectively made by the already placed positions of each of the N_(P)(d) weight d CTBC codewords identified by table P(d). Furthermore, the values of D_(en,temp)²(i_(P)) are calculated only using positions from the positions vectors p(i_(P)) of table P(d) that have already been placed on fully completed columns of Γ. For example, if a non-zero coded bit from a CTBC codeword, i_(P), has been placed onto the first row, this would indicate a squared Euclidean distance of 8a² for that coded bit. However, it may be necessary to later place another coded bit of the same sequence onto the second row of the same column. If that happens, that 8a² contribution would be lowered to 4a². For this reason, D_(en,temp)²(i_(P)) is only updated based upon completed columns. If any sequence, i_(P), has all of its positions placed into Γ, then d_(s,temp)(i_(P)) and D_(en,temp)²(i_(P)) will have reached their highest values, so in such cases set d_(s)(i_(P))=d_(s,temp)(i_(P)) and D_(en)²(i_(P))=D_(en,temp)²(i_(P)). Note that if d_(s,temp)(i_(P))≧d_(s,t) and D_(en,temp)²(i_(P))≧D_(min,n)², then any remaining position of the entry i_(P) in P(d) can be placed at any available place in Γ because that placement will not lower the targets d_(s,t) or D_(min)². This is because, if the “temp” values are already at or above the target values, there is no need to consider the additional weight or distance above the threshold target values. If d_(s,temp)(i_(P))<d_(s,t) and/or D_(en,temp)²(i_(P))<D_(min,n)², record d_(s,max)(i_(P)) and D_(en,max)²(i_(P)). This indicates the best case numbers for the weight d sequences that still need to be placed.

In order to systematically place the additional positions of the table P(d), identify the subset of sequences, P′(d)⊂P(d), for which d_(s,temp)(i_(P))<d_(s,t) or D_(en,temp)²(i_(P))<D_(min,n)². The set P′(d) thus contains the sequences, i_(P), that still need to be placed so as to meet the target values, d_(s,t) and D_(min,n)². For sequences that already have d_(s,temp)(i_(P))≧d_(s,t) and D_(en,temp)²(i_(P))≧D_(min,n)², there is no need to waste key positions in Γ on the additional positions of the sequences in P(d) that have already satisfied Γ's interleaver constraints. Such positions can be placed later after all of Γ's interleaver constraints have been met.

Next construct a set E′(d) consisting of the popularity-ranked unique positions in P′(d), and construct a set H′(d) by removing all of the positions in E′(d) that have already been placed into Γ. Next identify a candidate position from H′(d), starting from left to right (highest popularity to lowest popularity), and attempt to place this candidate position from H′(d) into the left-most column of Γ that has a vacant position on the second row. Similar to step 2, before the placement can be made, it should be verified that the position already occupying the first row of the same column is not a position associated with any sequence i_(P) in P′(d) that contains the candidate position. If the first row does not contain any position associated with any sequence i_(P) that contains the candidate position, the candidate position is placed into the left-most column of Γ that has a vacant position on the second row. If not, the process is repeated by attempting to place the candidate position into the next left-most column of Γ with a vacant position on the second row and ensuring that the above described constraint is satisfied. This process is repeated until the candidate position is placed. Once the candidate position from H′(d) is placed in Γ, for all affected sequences, i_(P), in P(d), update d_(s,temp)(i_(P)), D_(en,temp)²(i_(P)), d_(s,max)(i_(P)) and D_(en,max)²(i_(P)). Continue placing the remaining elements of H′(d), one at a time, until d_(s,temp)(i_(P))≧d_(s,t) and D_(en,temp)²(i_(P))≧D_(min,n)² for all i_(P) in P′(d), or until it is determined that it is impossible to do so. The above process will ensure that the elements of E′(d) will be placed in such a way that no two positions of any sequence, i_(P), in P′(d) will be placed into the same column, thereby maximizing the d_(s,temp)(i_(P)) values, and thereby achieving the highest value of d_(s,t).

In the case where it is possible to meet these conditions for all sequences i_(P) in P′(d), it may also be the case that these conditions are met before all of the positions in H′(d) have been placed. If there are such additional unconstrained positions in H′(d), do not place them at this time, so as to leave as many vacant locations in Γ as possible for the later placement of positions from higher weight sequences of v subject to Γ's interleaver constraints.

On the other hand, if it was not possible to place all positions of H′(d) to satisfy d_(s,temp)(i_(P))≧d_(s) and D_(en,temp) ²(i_(P))≧D_(min,n) ² for all the sequences i_(P) in P′(d), then a roll-back can be attempted. Start by identifying positions on the first row (as mentioned at the end of step 2) that can be moved to the second row (or lower rows) without lowering the targets d_(s,t) or D_(min) ². Note that, if the values of d_(s,t) and D_(min,n) ² had to be lowered one or more times, there will have been positions placed not only on the first row but also on the second row (or other rows) for some sequences that are no longer needed to maintain the now less restrictive interleaver constraints, d_(s,temp)(i_(P))≧d_(s) and D_(en,temp) ²(i_(P))≧D_(min,n) ². In order to systematically identify the positions on rows that can be removed without violating constraints, identify the subset P_(Q)(d≧d_(t))⊂P(d≧d_(t)) whose elements are sequences, i_(P), which have values of d_(s)(i_(P))>d_(s,t) and D_(en) ²(i_(P))>D_(min,n) ² that are high enough so that at least one position of each of these sequences can afford to be moved out of Γ to create a vacancy in Γ while still maintaining d_(s)(i_(P))≧d_(s) and D_(en) ²(i_(P))≧D_(min,n) ² for all affected sequences, i_(P), in P(d≧d_(t)).

Next form a set E_(Q)(d≧d_(t)) containing a popularity-ranked (descending order of the popularity of its distinct entries in P_(Q)(d≧d_(t))) set of positions that can be removed from Γ without lowering the targets d_(s,t) or D_(min) ². That is, all of the positions in the subset E_(Q)(d≧d_(t)) can be removed from Γ while still maintaining d_(s)(i_(P))≧d_(s) and D_(en) ²(i_(P))≧D_(min,n) ² for all sequences in P(d≧d_(t)). Note that sequences in P_(Q)(d≧d_(t)) can afford to lower their distances whereas the sequences in P′(d) have to increase their distances. With that in mind, swap positions of the sequences of P′(d) placed in H′(d) from left-to-right with the positions of E_(Q)(d≧d_(t)) from right-to-left. By doing so, the distances of the least number of sequences that can afford to lower their distances are lowered while the highest number of sequences that are in need of increasing their distances are increased. After every swap, update d_(s)(i_(P)) and D_(en) ²(i_(P)) of all affected sequences in P(d≧d_(t)), and update P′(d), P_(Q)(d≧d_(t)), and E_(Q)(d≧d_(t)). Continue this process to try to make d_(s)(i_(P))≧d_(s) and D_(en) ²(i_(P))≧D_(min,n) ² for all sequences i_(P) in P(d). If all sequences in P′(d) can be made to satisfy the constraints d_(s)(i_(P))≧d_(s) and D_(en) ²(i_(P))≧D_(min,n) ², repeat step 3 until d=d_(f). If any potential swap would cause any CICM interleaver constraint to be violated for any sequence i_(P), the swap is not made.

Note that during step 3, some (or all) of the positions that were moved out of Γ for later placement could be picked up by the next set (or sets) of weights d≦d_(f). Hence, it is possible to get Γ mostly (or totally) filled by different positions before reaching d=d_(f). At this point, in order to guarantee the target d_(s,t) and D_(min,n) ² values, it is still necessary to keep generating coded sequences of v until we reach d=d_(f). In that process, it is possible to find sequences of v whose positions are already almost or fully placed in Γ. If all of the positions of a newly identified sequence i_(P) have already been fully placed, then their d_(s)(i_(P)) and D_(min) ²(i_(P)) can be directly calculated. For other sequences for which the positions are partially placed in Γ, d_(s,max)(i_(P)) and D_(en,max) ²(i_(P)) can be calculated. If any of the d_(s)(i_(P)), D_(en) ²(i_(P)), d_(s,max)(i_(P)) and D_(en,max) ²(i_(P)) values happen to fall below their corresponding target values (d_(s) and D_(min,n) ²), it is necessary to make changes in Γ by swapping already placed positions in it until all constraints are met by all sequences in P(d≧d_(t)). Any violation of a constraint can result from either category (a): the sharing of columns of Γ by the coded bits of a given sequence i_(P), and/or category (b): positions of v that are mostly placed on the second row that make lower contributions to D_(en) ²(i_(P)). As discussed below, the additional constraints can be enforced during the placement of positions in Γ to avoid the sharing of columns of Γ by the coded bits of a given sequence i_(P).

In fact, if all the additional conditions such as inter-column conditions and inter-sequence constraints as described below can be fully satisfied during the placement of positions in Γ, violations caused by sequences that fall under category (a) can be completely eliminated. In situations where all inter-column conditions and inter-sequence constraints as described below cannot be fully satisfied and the case of category (a) occurs for any given sequence i_(P), then it becomes necessary to move some of the positions, preferably starting from the positions of the sequence i_(P) that share the same columns in Γ. In any such sequence i_(P), it is first desirable (and may even be sufficient) to move positions that share the same columns of Γ. This change can be done by swapping with positions on the same row. For example, if such a sequence i_(P) currently has four of its positions in two columns of Γ, a position from one of the rows of each column can be swapped with a different position on the same row. In the selection of the row of the selected column, it is preferable to select the row that contains the position that has the lower popularity, to minimize the number of affected sequences. Such a swap increases the d_(s)(i_(P)) and D_(en) ²(i_(P)) values of that sequence. Further, it is desirable to find a position that can be swapped without lowering d_(s)(i_(P)) and D_(en) ²(i_(P)) of any other sequence, including the sequences that contain the position selected for the swap on the same row. It is also desirable to select a position on the same row that is contained only by sequences in P(d≧d_(t)) that barely satisfy the two constraints. This is because if only such sequences are involved, it is not necessary to use up the other sequences that can afford to lower their d_(s)(i_(P)) and D_(en,min) ² values on these swaps; instead, it is better to save them to form E_(Q)(d≧d_(t)).
Any sequence that satisfies the two constraints and does not qualify to feed positions to E_(Q)(d≧d_(t)) can be considered as such a sequence that barely satisfies the constraints. Hence, it is helpful to form a set, Ē_(Q)(d≧d_(t)), for each row separately, that lists the positions on the respective row, in decreasing rank popularity order, that are not in the set E_(Q)(d≧d_(t)) and can thus be used for the swaps. In order to systematically handle sequences that fall under category (a), (i) identify the set of all sequences that do not satisfy one or both constraints and come under category (a), (ii) identify the set of distinct columns that are shared by the sequences found in (i), (iii) identify each position on the least popular row of each of the selected columns, and (iv) find the least popular position from Ē_(Q)(d≧d_(t)) of the same row that can be used for the swap with each selected position in step (iii). After every swap, re-calculate d_(s)(i_(P)) and D_(en) ²(i_(P)) of all completed sequences, and d_(s,max)(i_(P)) and D_(en,max) ²(i_(P)) of all partially completed sequences, to identify the sequences that still do not satisfy either of the two constraints. If the above steps (i) through (iv) can be successfully completed, all d_(s)(i_(P)) and d_(s,max)(i_(P)) values will be guaranteed to satisfy the condition d_(s)(i_(P))≧d_(s,t). However, some sequences may still need to increase their D_(en) ²(i_(P)) values. For those sequences, it is necessary to swap selected positions of them on the second row with positions on the first row, as mentioned in category (b). The sequences under category (b) mentioned above can be handled by using the same approach used to find places in Γ for positions in H′(d) using the set E_(Q)(d≧d_(t)). Instead of H′(d), the set of distinct positions of all the sequences i_(P) that fall under category (b), in their descending rank popularity order, can be used.

However, at any value of d, if step 3 fails to make all sequences of P′(d) satisfy the constraints, then:

(a) lower d_(s,t) and/or D_(min) ² and recalculate all needed parameters such as d_(f) as discussed above for these lower values.

(b) repeat step 3 with these new lower values. Failing that,

(c) go back to step 1 with these lower values of d_(s,t) and D_(min) ². If that also fails,

(d) repeat (a)-(d) until all sequences i_(P) in P(d≧d_(t)) can satisfy d_(s)(i_(P))≧d_(s) and D_(en) ²(i_(P))≧D_(min,n) ².

Once all of the interleaver constraints have been met for all d_(t)≦d≦d_(f) as described above, any and all of the remaining positions, i=0, . . . K−1, that have not already been placed into Γ can be placed anywhere in Γ without violating the interleaver constraints. The values of d_(s,t) and D_(min,n) ² at the point of stopping are the values that can be finally reached.

It is important to note here that throughout the above discussion we have used one d_(s,t) and one D_(min) ² value for all sequences of v. However, because v is generated from a concatenated code that achieves different interleaver gains for different sequences, it can be desirable to employ different d_(s)(i_(P)) and D_(min) ²(i_(P)) values for different categories of sequences of v(i_(P)), as described above in connection with the CI-3 and CI-4 design algorithms. Since all calculations that are used to design Γ employ d_(s)(i_(P)) and D_(min) ²(i_(P)) values individually on sequences, the above method can be directly used for varying sets of d_(s,t) and D_(min) ² values for different sequences if desired. Since the higher weight sequences v of the concatenation usually achieve higher interleaver gains, even though it is in general necessary to consider all weights up to d_(f), it may be sufficient to only consider weights up to a weight less than d_(f) to achieve good performance for selected CTBC codes. Hence it is to be understood that in any of the algorithms and examples presented herein, the interleaver constraints can be modified to employ different d_(s)(i_(P)) and D_(min) ²(i_(P)) constraint thresholds depending upon the category to which any given sequence v(i_(P)) belongs.

A. Inter-Column Constraints:

In order to understand the impact that the constellation-mapping of bits onto symbols has on D_(min) ², consider a given sequence i_(P) listed in P(d). Let x(i_(P)) be the number of columns of Γ that contain one position from the sequence i_(P) on the first row and zero positions of i_(P) on the second row. Let y(i_(P)) be the number of columns of Γ that contain zero positions from the sequence i_(P) on the first row and one position from the sequence i_(P) on the second row. Let z(i_(P)) be the number of columns of Γ that contain two positions from the sequence i_(P), one on the first row and another on the second row. With these definitions and the constellation mapping rule of FIG. 15, the resulting normalized squared Euclidean distance of the sequence i_(P) will be given by

D _(en) ²(i _(P))=(2x(i _(P))+y(i _(P))+z(i _(P))).  (11)

Further, since the sequence i_(P) is taken from P(d) and thus has weight d=d(i_(P)), the parameters x(i_(P)), y(i_(P)) and z(i_(P)) will necessarily satisfy

x(i _(P))+y(i _(P))+2z(i _(P))=d(i _(P))  (12)

where d(i_(P)) is the weight of the sequence i_(P). It follows from (11) and (12) that

D _(en) ²(i _(P))=[(x(i _(P))−z(i _(P)))+d(i _(P))]  (13)

and d_(s)(i_(P))=x(i_(P))+y(i_(P))+z(i_(P)). Further, from equations (11) through (13) it follows that for any pre-selected pair of values d_(s,t) and D_(min,n) ², if the sequence i_(P) satisfies both constraints, then x(i_(P)) and z(i_(P)) must satisfy

z(i _(P))≦(d(i _(P))−d _(s,t)),  (14a)

and

x(i _(P))≧[D _(min,n) ² −d(i _(P))+z(i _(P))].  (14b)
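For reference, the step from (11) and (12) to (13) and (14) can be written out as follows. This is only a restatement of the relations above, with the argument (i_(P)) dropped from x, y, z and d for brevity.

```latex
\begin{align*}
D_{en}^{2} &= 2x + y + z                                  && \text{(11)} \\
y          &= d - x - 2z                                  && \text{rearranging (12)} \\
D_{en}^{2} &= 2x + (d - x - 2z) + z = (x - z) + d         && \text{(13)} \\
d_{s}      &= x + y + z = d - z \;\ge\; d_{s,t}
             \;\Longrightarrow\; z \le d - d_{s,t}        && \text{(14a)} \\
D_{en}^{2} &= (x - z) + d \;\ge\; D_{min,n}^{2}
             \;\Longrightarrow\; x \ge D_{min,n}^{2} - d + z && \text{(14b)}
\end{align*}
```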

Also, it follows from (14) that the maximum allowable value of z(i_(P)), z_(max)(i_(P)), and the minimum required value of x(i_(P)), x_(min)(i_(P)), can be computed as

z _(max)(i _(P))=(d(i _(P))−d _(s))  (15a)

and

x _(min)(i _(P))=[D _(min,n) ² −d(i _(P))+z(i _(P))].  (15b)

As can be seen from equations (14) and (15), it is desirable to have a low value of z(i_(P)) (like z(i_(P))=0). This is because a lower value of z(i_(P)) can increase the value of d_(s,t) and decrease the required value of x(i_(P)). However, when placing the positions of any sequence i_(P) into Γ, a potential current lack of available locations in Γ may give rise to the requirement that z(i_(P))>0. Hence, as each of the main algorithmic steps 1 through 3 as described above is executed, for each sequence i_(P) in the table P(d≧d_(t)), it is desirable to compute and record x_(min)(i_(P)) and z_(max)(i_(P)) and to then use these values to guide the placement of the positions of each sequence i_(P) into Γ.
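The bookkeeping implied by equations (11) through (15) can be illustrated with a short sketch. It assumes a two-row Γ stored as two Python lists and a sequence i_(P) stored as a set of positions; the helper names column_counts, distances and limits are hypothetical and chosen only for this example.

```python
from typing import List, Optional, Set, Tuple


def column_counts(seq: Set[int],
                  gamma: List[List[Optional[int]]]) -> Tuple[int, int, int]:
    """Return (x, y, z) for one sequence: columns hit on the first row only,
    on the second row only, and on both rows."""
    first_row, second_row = gamma
    x = y = z = 0
    for top, bottom in zip(first_row, second_row):
        in_top, in_bottom = top in seq, bottom in seq
        if in_top and in_bottom:
            z += 1
        elif in_top:
            x += 1
        elif in_bottom:
            y += 1
    return x, y, z


def distances(x: int, y: int, z: int) -> Tuple[int, int]:
    """Normalized squared Euclidean distance per (11) and symbol Hamming distance."""
    return 2 * x + y + z, x + y + z


def limits(d: int, d_s_t: int, d_min_n_sq: int, z: int) -> Tuple[int, int]:
    """z_max per (15a) and x_min per (15b) for a weight-d sequence."""
    return d - d_s_t, d_min_n_sq - d + z


# Illustrative 2 x 4 placement and a weight-4 sequence.
gamma = [[0, 1, 2, 3], [4, 5, 6, 7]]
seq = {0, 4, 1, 6}
x, y, z = column_counts(seq, gamma)
d_en_sq, d_s = distances(x, y, z)
```

For this toy placement the counts satisfy (12), since x+y+2z equals the weight 4, and the value from (11) agrees with (13).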

When the CICM mapping rule design algorithm begins to execute, or at each new pass through step 3, for each identified sequence i_(P) in P(d), z(i_(P)) is initialized to z(i_(P))=0, and the starting value of x_(min)(i_(P)) is computed from equation (15b). However, as each additional position from the set E′(d) or H′(d) is placed into Γ, the values of z(i_(P)) for all affected sequences, i_(P), may need to be increased, and at such time, any such affected value of x_(min)(i_(P)) is then updated in accordance with equation (15b). In order to monitor the progress of the values of x(i_(P)) and z(i_(P)), as each one of the positions from the set E′(d) or H′(d) is placed into Γ, for all affected sequences, i_(P), first update x_(min)(i_(P)) and z_(max)(i_(P)) using equation (15), and additionally monitor the current x(i_(P)) value, x_(temp)(i_(P)), and the current z(i_(P)) value, z_(temp)(i_(P)). Initially, before any of the positions of any such sequence i_(P) have been placed, initialize x_(temp)(i_(P))=z_(temp)(i_(P))=0. Note that all of these parameter updates are computed only when both rows of a column in Γ are filled. This is because the above parameters are only fixed (not subject to future change) after both of the locations of a column in Γ have been filled.

In order to satisfy the inter-column constraints, the goal is to achieve x_(temp)(i_(P))≧x_(min)(i_(P)) and z_(temp)(i_(P))≦z_(max)(i_(P)) when all positions of any sequence, i_(P), have been placed, thereby satisfying (14a) and (14b). If this cannot be done for all sequences, i_(P), all the way up to weight d_(f), then a roll-back can be attempted and steps 1 through 3 of the CICM mapping rule design algorithm can be executed again to see if a valid Γ can be found. If the roll-back attempt fails, then lower d_(s,t) and/or D_(min,n) ² as discussed above, and keep executing the CICM mapping rule design algorithm until equation (14) can be satisfied for all sequences, i_(P), in P(d≦d_(f)).

B. Inter-Sequence Constraints:

The above steps 1 through 3 of the CICM mapping rule design algorithm implement constraints and perform processing based upon each of the individual coded sequences, i_(P), in P(d), for d=d_(t), . . . , d_(f). In addition to considering single sequences, additional inter-sequence constraints are needed to avoid conditions that can arise where multiple different sequences interact to cause the D_(min) ² value of the constellation to decrease. There is no need to implement inter-sequence constraints to ensure that the target symbol Hamming distance is maintained at d_(s,t) because the target symbol Hamming distance is only affected by the placement of the positions in each sequence, i_(P), when considered alone.

To understand the inter-sequence constraints that ensure that D_(min,n) ² is not lowered due to combinations of sequences, consider a specific example involving a weight d_(t) sequence, i_(P1), from the table P(d_(t)). In this example it is desired to maintain D_(min,n) ²=1.5d_(t) by placing x(i_(P1))=d_(t)/2 positions on the first row, y(i_(P1))=d_(t)/2 positions on the second row, and to have z(i_(P1))=0. Next consider a second weight d_(t) sequence, i_(P2), also from the table P(d_(t)). In this example, d_(t)/2 positions of the second sequence i_(P2) are also placed on the first row, and in the same columns where d_(t)/2 of the positions of the sequence i_(P1) have been placed on the second row. Also, d_(t)/2 of the positions of i_(P2) are placed on the second row in the same columns where d_(t)/2 positions of the sequence i_(P1) have been placed on the first row. In the end the combination of the two sequences still maintains d_(s,t)=d_(t), but D_(min,n) ² for the combination is now lowered to d_(t) because the two sequences in combination generate d_(t) number of “11” QPSK symbols. The easiest way to prevent these types of undesirable inter-sequence interactions is to limit the number of columns any two different sequences can share. In the above mentioned example, sequences i_(P1) and i_(P2) are allowed to share d_(t) columns. Such cases can be prevented by imposing an inter-sequence constraint to limit the number of columns that certain potentially troublesome pairs of sequences, i_(P1) and i_(P2), can share.
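The interaction in this example can be checked numerically for d_(t)=4. The sketch below is illustrative only: it assumes the per-column contributions implied by equation (11) (2 for a first-row-only column, 1 for a second-row-only column, 1 for a column occupied on both rows), it assumes the two sequences have no positions in common so that their combination is the union of their positions, and the position labels are arbitrary.

```python
from typing import List, Set


def normalized_sq_distance(positions: Set[int],
                           first_row: List[int],
                           second_row: List[int]) -> int:
    """Sum the per-column contributions of equation (11) for a set of positions."""
    total = 0
    for top, bottom in zip(first_row, second_row):
        in_top, in_bottom = top in positions, bottom in positions
        if in_top and not in_bottom:
            total += 2   # first row only
        elif in_top or in_bottom:
            total += 1   # second row only, or both rows ("11" symbol)
    return total


# d_t = 4: each sequence has two positions on each row, in complementary columns.
first_row = [0, 1, 6, 7]
second_row = [4, 5, 2, 3]
seq1 = {0, 1, 2, 3}
seq2 = {6, 7, 4, 5}

d1 = normalized_sq_distance(seq1, first_row, second_row)
d2 = normalized_sq_distance(seq2, first_row, second_row)
d_combined = normalized_sq_distance(seq1 | seq2, first_row, second_row)
```

Each sequence alone reaches 1.5d_(t)=6, while the combination occupies both rows of all four columns and drops to d_(t)=4.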

Specifically, the potentially troublesome pairs of sequences, i_(P1) and i_(P2), are called “disjoint sequences” herein. To understand what a pair of disjoint sequences is, first note that the sequences i_(P1) and i_(P2) are each low weight sequences as listed in the table P(d) or P(d≧d_(t)) and thus have respective weights d(i_(P1)) and d(i_(P2)). Next note that each of the sequences, i_(P1) and i_(P2), is associated with a CTBC codeword, v(i_(P1)) and v(i_(P2)) respectively. Also, due to the way the CTBC coded v-sequences are formed, i.e., v=C[u]=G[π[c]], each of the position vectors, i_(P1) and i_(P2), is also associated with an OBC coded sequence, c(i_(P1)) and c(i_(P2)) respectively. Recall that each sequence c can be viewed as a naturally ordered set of OBC codeword positions, {c_(j)}, for j=0, 1, 2, . . . , ρ−1. Therefore, two sequences, i_(P1) and i_(P2), are said to be disjoint sequences if the nonzero codeword positions in the vectors c(i_(P1)) and c(i_(P2)) are disjoint, i.e., non-overlapping. Inter-sequence constraints are developed below to specifically eliminate potential ill effects due to higher weight CTBC codewords, v, that include at least two disjoint low weight vectors c(i_(P1)) and c(i_(P2)) that correspond to sequences, i_(P1) and i_(P2), in the table P(d≧d_(t)).

In order to identify disjoint sequences at any weight d, start by performing an analysis run of the CI-4 design algorithm in order to construct the table, P(d). Next, for each position vector, i_(P), in P(d), identify the corresponding vectors v and c, such that v=G[π[c]]. Next, for each sequence i_(P1) in the table P(d≧d_(t)), identify a corresponding set of disjoint sequences Δ_(dis)(i_(P1)), where if i_(P2) is disjoint relative to i_(P1) (i.e., there is no overlap in c(i_(P1)) and c(i_(P2))), then i_(P2) is included as a member of Δ_(dis)(i_(P1)). Each time step 3 is entered so that the weight d is incremented, if i_(P) was already in the set {P(d_(t)), . . . , P(d−1)}, then the set Δ_(dis)(i_(P)) need only be updated by adding any of the weight d sequences in P(d) that are disjoint to i_(P). Also, by observing the associated vectors c(i_(P1)), c(i_(P2)), and c(i_(P3)), it can be readily seen that if i_(P2) and i_(P3) are two entries of Δ_(dis)(i_(P1)), then even though i_(P2) and i_(P3) are disjoint relative to i_(P1), i_(P2) and i_(P3) are not necessarily disjoint sequences relative to each other.
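The construction of the sets Δ_(dis)(i_(P)) can be sketched as below. The sketch assumes that, for each sequence, the indices j of its nonzero OBC codewords c_(j) have already been extracted into a set; the function name disjoint_sets and the labels p1, p2, p3 are hypothetical.

```python
from typing import Dict, Set


def disjoint_sets(c_support: Dict[str, Set[int]]) -> Dict[str, Set[str]]:
    """For every sequence, list the other sequences whose nonzero OBC codeword
    indices do not overlap with its own."""
    return {
        a: {b for b, sb in c_support.items() if b != a and sa.isdisjoint(sb)}
        for a, sa in c_support.items()
    }


# Hypothetical supports: indices of the nonzero codewords c_j of each sequence.
c_support = {"p1": {0, 1}, "p2": {2, 3}, "p3": {3, 4}}
delta_dis = disjoint_sets(c_support)
```

Here p2 and p3 both appear in the set for p1, yet they share codeword index 3 and so are not disjoint relative to each other, which is the situation described above.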

Define the quantity, sh(i_(P1),i_(P2)), to be the highest number of columns that any two selected sequences i_(P1) and i_(P2) can share without causing the MED to be lowered below D_(min) ². Recall that the weight of the sequence i_(P1) is d(i_(P1)) and the weight of the sequence i_(P2) is d(i_(P2)). Then for the two sequences to achieve an MED of at least D_(min) ², the following inequality must be satisfied,

[x(i _(P1))+x(i _(P2))]−[z(i _(P1))+z(i _(P2))]+[d(i _(P1))+d(i_(P2))]−2sh(i _(P1) ,i _(P2))≧D _(min,n) ²  (16)

and hence,

sh(i _(P1) ,i _(P2))≦└[(x(i _(P1))+x(i _(P2)))−(z(i _(P1))+z(i _(P2)))+(d(i _(P1))+d(i _(P2)))−D _(min,n) ²]/2┘  (17)

To better understand the inequality (16), note that the two sequences i_(P1) and i_(P2) each individually satisfy equation (12) at their respective weights, d(i_(P1)) and d(i_(P2)). Also, the combination of the two sequences forms (x(i_(P1))+x(i_(P2))) columns with a one on the first row and a zero on the second, (y(i_(P1))+y(i_(P2))) columns with a zero on the first row and a one on the second, and (z(i_(P1))+z(i_(P2))+sh(i_(P1),i_(P2))) columns with a one on both rows. The sum of squared Euclidean distance contributions from all of these columns must be at least D_(min) ².

Since the two sequences can have only (x(i_(P1))+x(i_(P2))) number of free ones on the first row, and (y(i_(P1))+y(i_(P2))) number of free ones on the second row, the maximum number of columns the combination can form with ones on both rows, i.e., sh(i_(P1),i_(P2)), is further restricted by

sh(i _(P1) ,i _(P2))≦min{(x(i _(P1))+x(i _(P2))),(y(i _(P1))+y(i_(P2)))}.  (18)

However, the sh(i_(P1),i_(P2)) values can only be calculated after all values, x(i_(P1)), x(i_(P2)), z(i_(P1)) and z(i_(P2)), are known, and these values only become available after all the positions of i_(P1) and i_(P2) are placed. Hence, the following temporary values are calculated

x _(t)(i _(P1))=max{x _(temp)(i _(P1)),x _(min)(i _(P1))}  (19a)

z _(t)(i _(P1))=min{z _(temp)(i _(P1)),z _(max)(i _(P1))}  (19b)

x _(t)(i _(P2))=max{x _(temp)(i _(P2)),x _(min)(i _(P2))}  (19c)

z _(t)(i _(P2))=min{z _(temp)(i _(P2)),z _(max)(i _(P2))}  (19d)

to use in place of the respective values, x(i_(P1)), z(i_(P1)), x(i_(P2)) and z(i_(P2)), in equation (17). Note that for any two disjoint sequences i_(P1) and i_(P2), it is also possible to use the parameters of equation (9) to calculate a temporary sh(i_(P1),i_(P2)) value, sh(i_(P1),i_(P2))_(temp), that indicates the highest allowable number of columns the two sequences can share based on the currently available information. Once the positions are placed in Γ, these sh(i_(P1),i_(P2))_(temp) values can be updated to ensure that in the end equation (17) is satisfied.
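As a rough illustration of how the bounds (17) and (18) and the temporary values of equation (19) might be evaluated, consider the following sketch. The function names temp_values and max_shared_columns are hypothetical, and clipping the result at zero is an assumption of this example rather than part of equation (17).

```python
import math
from typing import Optional, Tuple


def temp_values(x_temp: int, x_min: int, z_temp: int, z_max: int) -> Tuple[int, int]:
    """Temporary x and z for one sequence, per equations (19a)-(19d)."""
    return max(x_temp, x_min), min(z_temp, z_max)


def max_shared_columns(x1: int, z1: int, d1: int,
                       x2: int, z2: int, d2: int,
                       d_min_n_sq: float,
                       y1: Optional[int] = None,
                       y2: Optional[int] = None) -> int:
    """Largest sh(i_P1, i_P2) allowed by (17), optionally tightened by (18)."""
    bound = math.floor(((x1 + x2) - (z1 + z2) + (d1 + d2) - d_min_n_sq) / 2)
    if y1 is not None and y2 is not None:
        bound = min(bound, x1 + x2, y1 + y2)
    return max(bound, 0)  # assumption: a negative bound is reported as zero


# Two weight-4 sequences with x = y = 2, z = 0 and a target of 1.5 * d_t = 6.
sh_max = max_shared_columns(2, 0, 4, 2, 0, 4, 6, y1=2, y2=2)
x_t, z_t = temp_values(x_temp=1, x_min=2, z_temp=0, z_max=1)
```

With these numbers the pair may share at most three columns; sharing all four would leave the left side of inequality (16) at 4, below the target of 6.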

Further, it is noted that if D_(min,n) ²≦d_(t), inter-sequence constraints are not needed, because two sequences cannot interact to lower D_(min,n) ² below d_(t). For any two sequences, each with weight d, the worst case for D_(min,n) ² is to have both the sequences completely aligned in d columns, in which case D_(en) ²(i_(P))=d. Similarly, if d_(t)<D_(min,n) ²≦1.5d_(t), equation (17) must be satisfied by every pair of disjoint sequences, but no more than pairs of sequences need be considered. This is because more than two sequences cannot interact to lower D_(min,n) ² below 1.5d_(t).

However, if 1.5d_(t)<D_(min,n) ²≦2d_(t), it should be ensured that equation (17) is satisfied for all pairs, and in addition, it should be ensured that no combinations of three mutually disjoint sequences achieve a normalized squared MED below D_(min,n) ². A set of sequences is called mutually disjoint if every two sequences of that set are disjoint. Consider three mutually disjoint sequences v(i_(P1)), v(i_(P2)) and v(i_(P3)) with respective weights d(i_(P1)), d(i_(P2)) and d(i_(P3)) and with their respective parameters, x(i_(P1)) & z(i_(P1)), x(i_(P2)) & z(i_(P2)), and x(i_(P3)) & z(i_(P3)). Following the same logic as equation (17), the equation can be extended to three mutually disjoint sequences as

$\begin{matrix}{{{sh}\left( {i_{P1},i_{P2}} \right)} + {{sh}\left( {i_{P1},i_{P3}} \right)} + {{sh}\left( {i_{P2},i_{P3}} \right)} \leq \left\lfloor \frac{{\sum\limits_{k = 1}^{3}\left\lbrack {{x\left( i_{Pk} \right)} - {z\left( i_{Pk} \right)} + {d\left( i_{Pk} \right)}} \right\rbrack} - D_{min,n}^{2}}{2} \right\rfloor} & (20)\end{matrix}$

Hence, when updating any sh(i_(P1),i_(P2)) value, it is not only necessary to ensure that it satisfies equation (17), but it is also necessary that it satisfies (20) based on the already available values of sh(i_(P2),i_(P3)) and sh(i_(P1),i_(P3)). Since the highest achievable D_(min,n) ² of the QPSK constellation in FIG. 15 considered in this example is 2d_(t), it is only necessary to watch for combinations of up to three disjoint sequences, which are taken care of by (17) and (20). However, in constellations with more constellation points, combinations of multiple disjoint sequences need to be checked by equations similar to (20).

In order to reduce the complexity of implementing inter-sequence constraints, Method 1 and Method 2 are presented. Under broad conditions, Methods 1 and 2 can reduce or altogether avoid the need to check equation (20) and similar equations that deal with combinations of more than three disjoint sequences.

Method 1:

This method imposes stronger restrictions on pairs of sequences than those given by (17) so that multiple combinations automatically satisfy the MED condition. For example, if the division by 2 in (17) is changed to a division by 4, equation (20) will be guaranteed to be always satisfied.

In order to better understand why, consider a combination of weight d mutually disjoint sequences in an application, similar to the 16-QAM example discussed later, that maintains the same Euclidean distance for all one-bit, 2-bit, and so on up to m-bit differentials on the constellation. Let us also consider the case when x₁(i_(P))=d(i_(P)), i.e., all coded bits of every weight d sequence i_(P) are placed in d(i_(P))=d different columns and on the first row, whose associated distance is D₁ ². Then each such sequence i_(P) individually achieves a normalized squared Euclidean distance of D_(en) ²(i_(P))=dD₁ ². Next consider the case that any two disjoint sequences, i_(P1) and i_(P2), are both of weight d(i_(P1))=d(i_(P2))=d and that these two sequences can only share one column, i.e., sh(i_(P1),i_(P2))=1. Hence, in this case the highest possible minimum squared Euclidean distance achieved by any two sequences i_(P1) and i_(P2) is D_(en) ²(i_(P1),i_(P2))=2(d−1)D₁ ²+D₂ ² (where D₂ ² is the distance associated with a two-bit differential on the constellation). Similarly, the highest possible minimum squared Euclidean distance achieved by any three weight d sequences i_(P1), i_(P2), and i_(P3) is D_(en) ²(i_(P1),i_(P2),i_(P3))=3(d−2)D₁ ²+3D₂ ². Hence, the minimum squared Euclidean distance achieved by the worst case of (d+1) sequences i_(P1), i_(P2), i_(P3), . . . , i_(P(d+1)) is D_(en) ²(i_(P1),i_(P2),i_(P3), . . . , i_(P(d+1)))=[d(d−1)/2]D₂ ². Depending on the values of d, D₁ ² and D₂ ², [d(d−1)/2]D₂ ² can be larger than dD₁ ². Note that if sh(i_(P1),i_(P2))>1, fewer than (d+1) disjoint sequences can result in a squared Euclidean distance that is dependent only on D₂ ². Hence, depending on d, D₁ ², and D₂ ², the highest possible value of sh(i_(P1),i_(P2)) between any two sequences i_(P1) and i_(P2) can be chosen to make the squared Euclidean distance of any combination of disjoint sequences larger than dD₁ ².

Therefore, by choosing sh(i_(P1),i_(P2)) values to ensure D_(en) ²≧D_(min,n) ², it is possible to guarantee that any combination of disjoint sequences generates the desired MED. When Method 1 is compared to equation (17), equation (17) allows two sequences i_(P1) and i_(P2) to share more columns. Note that equation (20) must then also be satisfied separately, and based upon the number of columns shared by sequences i_(P1) & i_(P2) and i_(P2) & i_(P3), equation (20) determines how many columns i_(P1) & i_(P3) can be allowed to share. This suggests that it is always desirable to limit the number of columns any two disjoint sequences can share, even if it imposes additional restrictions on finding locations in Γ to place the positions contained in E(d≧d_(t)).

Method 2: Method 2 is based on the fact that equations (17) and (20) apply only to disjoint sequences. Hence, if each column of Γ is constrained to hold a subset of positions whose associated subset of sequences, {i_(P1)}, does not include any disjoint sequences, this would avoid all of these problematic situations completely. Hence, for example, when placing a position into a candidate location on the second row of a column of Γ, first identify all of the sequences in P(d≧d_(t)), {i_(P1)}, that contain the position already placed into the first row above the candidate location. Then identify all of the Δ_(dis)(i_(P1)) sets corresponding to the sequences {i_(P1)} that contain the position in the row above, and find a position to load into the candidate location that does not belong to any sequence in these identified sets, {Δ_(dis)(i_(P1))}.
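The Method 2 column test can be sketched as follows. The sketch assumes the sequences are stored as sets of positions keyed by a label, and that the sets Δ_(dis)(i_(P)) are available as a dictionary from each label to the labels of its disjoint sequences; the function name method2_allows is hypothetical.

```python
from typing import Dict, Set


def method2_allows(candidate: int,
                   above: int,
                   sequences: Dict[str, Set[int]],
                   disjoint: Dict[str, Set[str]]) -> bool:
    """Return True if `candidate` may sit below `above` in the same column,
    i.e. no sequence containing `candidate` is disjoint relative to a
    sequence containing `above`."""
    owners_above = [name for name, s in sequences.items() if above in s]
    owners_cand = [name for name, s in sequences.items() if candidate in s]
    return not any(c in disjoint.get(a, set())
                   for a in owners_above
                   for c in owners_cand)


# Hypothetical tables: p1 and p2 are disjoint sequences, p3 is not disjoint to p1.
sequences = {"p1": {0, 1}, "p2": {5, 6}, "p3": {1, 7}}
disjoint = {"p1": {"p2"}, "p2": {"p1"}, "p3": set()}
blocked = method2_allows(5, 0, sequences, disjoint)
allowed = method2_allows(7, 0, sequences, disjoint)
```

Position 5 is rejected below position 0 because their sequences p2 and p1 are disjoint, whereas position 7 is accepted. The separate check that a column does not hold two positions of the same sequence is not part of this sketch.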

In practice, the following sequence can be used. First attempt to place a position into the current candidate location using Method 2. Failing that, it will be necessary to place a position of a disjoint sequence relative to a position already placed in the same column above in Γ. Next use Method 1 with the smallest possible sh(i_(P1),i_(P2)) values as discussed in Method 1. Failing that, increase the sh(i_(P1),i_(P2)) values up to the levels required to meet equation (17) and then make sure that (20) is also satisfied.

When the inter-sequence constraints are used in any of the ways discussed above, it is desirable to create a second table P⁺(d≧d_(t)) along with P(d≧d_(t)) to list the linear combinations of the distinct coded sequences. The table P⁺(d≧d_(t)) can be derived from all the Δ_(dis)(i_(P)) entries. If i_(P2) is an entry of Δ_(dis)(i_(P1)), then the modulo 2 addition of v(i_(P1)) and v(i_(P2)) is listed in P⁺(d≧d_(t)). Before adding a new weight d sequence i_(P) into P(d), check to see if i_(P) corresponds to any entry of P⁺(d≧d_(t)). If so, it is not necessary to add that sequence to P(d).

In accordance with the definition of d_(e), in order to maintain the normalized squared MED at D_(min,n) ², it is necessary to ensure that no coded sequence of weight up to d_(e)=2D_(min,n) ² can generate a normalized squared MED less than D_(min,n) ². Hence, even if the inter-sequence constraints are ignored in the design stage, all problematic cases will be found in that search. However, it is highly desirable to impose the inter-sequence constraints during the design. This is because otherwise, once troublesome combinations of disjoint sequences that reduce D_(min,n) ² are found, it is necessary to swap positions after all positions are placed, and it becomes harder to keep doing so for a larger number of cases. If instead the inter-sequence constraints are imposed, it is not even necessary to check any combination of already checked disjoint sequences, as they are guaranteed to satisfy the MED requirement. Hence, one good way to implement this procedure is to create a second table P⁺(d≧d_(t)) along with P(d≧d_(t)) to list the linear combinations of the disjoint coded sequences. The set P⁺(d≧d_(t)) can be derived from all the Δ_(dis)(i_(P1)) entries. If i_(P2) is an entry of Δ_(dis)(i_(P1)), then the modulo 2 addition of v(i_(P1)) and v(i_(P2)) is added into the table P⁺(d≧d_(t)). Hence, when a new candidate sequence is identified for inclusion into P(d≧d_(t)), before entering it into P(d≧d_(t)), check to see if it is an entry of P⁺(d≧d_(t)). If so, do not add this new candidate sequence into P(d≧d_(t)). This way, P(d≧d_(t)) can be made shorter and checks for higher weights can be made easier.

CICM 16-QAM Constellation Example:

With the knowledge gained from the previous example, we now consider the systematic construction of Γ with a 16-QAM constellation. First a constellation mapping rule is chosen to maximize the Euclidean distance for single bit separations, with decreasing Euclidean distances for higher bit separations. FIG. 16 shows a 16-QAM constellation that uses this constellation mapping policy, which attempts to maintain the same Euclidean distance for all single bit differences. Even though the corner constellation points have different Euclidean distances for different one bit differences (due to the nature of the 16-QAM constellation), the constellation points A, B, C and D in FIG. 16 do achieve the same highest possible squared Euclidean distance, which is denoted as D₁ ²=20a². This same squared Euclidean distance is maintained as the minimum squared Euclidean distance for all one bit differences relative to every constellation point. Hence, in this 16-QAM example, in contrast to the previous QPSK example, each row of Γ will have a similar MED contribution, and thus the higher rows in Γ will not be favored over the lower rows. Therefore, in this 16-QAM example, the CICM mapping rule seeks to place the positions of each sequence i_(P) in separate columns without concern for the rows. As a result, the highest achievable MED of this 16-QAM example is D₁ ²d_(t), where D₁ ²=20a², and the highest achievable minimum symbol Hamming distance is d_(s,max)=d_(t). The corresponding highest achievable normalized MED is D_(min,n) ²=D_(min) ²/4a²=5d_(t).

In order to establish the mathematical background for analysis in a general manner, let us consider the placement of the positions associated with a weight d sequence i_(P) into Γ. In general, consider an M=2^(m)-ary constellation that has minimum Euclidean distance D_(b) for every b-bit separation, b=1, 2, . . . , m. Now let us say that this weight d sequence i_(P) is placed in Γ such that it occupies x_(b) columns with weight b, b=1, 2, . . . , m. For example, when d=d_(t), it is desired to have x₁=d_(t) and all other x_(b)=0 to achieve d_(s,max)=d_(t) and D_(min,max) ²=d_(t)D₁ ². However, for a general distribution of x_(b)'s, the resulting symbol Hamming distance, d_(s)(i_(P)), and the squared normalized Euclidean distance, D_(en) ²(i_(P)), of this general weight d(i_(P)) sequence i_(P) can be written as

$\begin{matrix}{{D_{en}^{2}\left( i_{P} \right)} = {\sum\limits_{b = 1}^{m}{{x_{b}\left( i_{P} \right)}D_{b}^{2}}}} & (21) \\{{d_{s}\left( i_{P} \right)} = {\sum\limits_{b = 1}^{m}{x_{b}\left( i_{P} \right)}}} & (22)\end{matrix}$

subject to the constraint imposed by its weight d(i_(P))

$\begin{matrix}{{\sum\limits_{b = 1}^{m}{{bx}_{b}\left( i_{P} \right)}} = {{d\left( i_{P} \right)}.}} & (23)\end{matrix}$
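As an illustration of (21)-(23), the column profile x_(b)(i_(P)) of a sequence can be read directly off a candidate Γ. The sketch below assumes that Γ is stored as a list of m rows, each holding K/m coded-bit positions, and that the distances D_(b) ² are supplied in units of a² without further normalization; the function names are hypothetical.

```python
from collections import Counter

def column_profile(gamma, support):
    """x[b] = number of columns of gamma holding exactly b positions of `support`."""
    m = len(gamma)
    num_cols = len(gamma[0])
    hits = [sum(1 for row in gamma if row[c] in support) for c in range(num_cols)]
    counts = Counter(h for h in hits if h > 0)
    return {b: counts.get(b, 0) for b in range(1, m + 1)}

def distances(x, d_sq):
    """Evaluate (21), (22) and the weight constraint (23) for a column profile x."""
    d_en_sq = sum(x[b] * d_sq[b] for b in x)  # (21)
    d_s = sum(x.values())                     # (22)
    weight = sum(b * x[b] for b in x)         # (23)
    return d_en_sq, d_s, weight

# Toy 4 x 4 arrangement (m = 4, K = 16) and a weight-5 sequence.
gamma = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
x = column_profile(gamma, {0, 4, 1, 6, 3})
# 16-QAM values quoted in this description, in units of a^2.
result = distances(x, {1: 20, 2: 8, 3: 4, 4: 32})
```

Here positions 0 and 4 share the first column, so x₂=1 and x₁=3, and the weight returned by (23) recovers d(i_(P))=5.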

In order to ensure that the target symbol Hamming distance, d_(s,t), is always met, it needs to be ensured that no sequence i_(P) of weight up to d_(w)=m(d_(s,t)−1) can create a symbol Hamming weight, d_(s)(i_(P)), less than d_(s,t). Similarly, in order to ensure a normalized squared Euclidean distance of D_(min,n) ², no sequence i_(P) of weight up to d_(e)=└mD_(min) ²/4a²┘ can generate a D_(en) ²(i_(P)) value less than D_(min,n) ². Again, d_(f)=max{d_(w),d_(e)} is used to identify the highest weight of sequences i_(P) that need to be eventually included into P(d≧d_(t)) in order to ensure that both the selected target d_(s,t) and D_(min,n) ² values are achieved.

Referring again to the 16-QAM constellation with the constellationmapping rule shown in FIG. 16, it can be seen that D₁ ²=20a², D₂ ²=8a²,D₃ ²=4a² and D₄ ²=32a². Hence, for any sequence i_(P), it follows from(21) that D_(en) ²(i_(P)) can be bounded by

$\begin{matrix}{{D_{en}^{2}\left( i_{P} \right)} \geq {{{x_{1}\left( i_{P} \right)}D_{1}^{2}} + {{x_{a}\left( i_{P} \right)}D_{a}^{2}}}} & (24)\end{matrix}$

where D_(a) ²=min{D_(j) ²}, j=1, 2, . . . , m, and j=j_(m) is the value of j that achieves this minimum, i.e., D_(a) ²=D_(j_m) ²

x _(a)(i _(P))={(d(i _(P))−x ₁(i _(P)))┌(d(i _(p))−x ₁(i _(P)))/j_(m)┐}.  (25)

The bound in (24), derived from (21), observes the following facts: (a) the highest contribution of (21) comes from its first term, (b) there are at least x_(a)(i_(P)) additional columns occupied by any sequence i_(P), and (c) each of these x_(a)(i_(P)) additional columns contributes at least D_(a) ² to D_(en) ²(i_(P)).

By considering D_(en) ²(i_(P)) in terms of its bound in (24), the design of Γ in the 16-QAM constellation can be easily related to that of the QPSK constellation in FIG. 15. The two constellations can be related by considering the x₁(i_(P)) and x_(a)(i_(P)) values of the 16-QAM example as equivalent to x(i_(P)) and z(i_(P)) in the QPSK example. Hence, in one way, the 16-QAM constellation becomes a little simpler because there is no need to consider a y(i_(P)) parameter for every sequence; however, it is now necessary to consider m=4 rows of Γ (as opposed to two in QPSK). Other than these minor differences, a similar method of constructing Γ can be applied in this second example involving the 16-QAM constellation of FIG. 16.

The algorithm starts off, similar to the QPSK example, by attempting to achieve d_(s,t)=d_(t) and D_(min) ²=D₁ ²d_(t), and then, if necessary, these targets are lowered gradually until a solution, Γ, is found. The steps can be summarized as:

Step 1. Find all weight d_(t) sequences i_(P) of the concatenation and load their position vectors into P(d_(t)). The information of every sequence i_(P) includes x_(1,min)(i_(P)) and x_(a,max)(i_(P)) (like x_(min)(i_(P)) and z_(max)(i_(P)) of the QPSK example), and c(i_(P)). Once all weight d_(t) sequences are loaded into P(d_(t)), find (or update) Δ_(dis)(i_(P)) for each sequence i_(P) in P(d_(t)) and form the set P⁺(d_(t)) using the modulo 2 addition of the disjoint sequences in P(d_(t)) as previously discussed. Check every new candidate sequence, i_(P), to see if it is already in P⁺(d_(t)), and if this sequence already appears in P⁺(d_(t)), do not enter it in P(d_(t)).

Step 2. Identify E(d_(t)), the set of distinct positions of weight d_(t) sequences. Rearrange E(d_(t)) in the order of descending popularity of the positions among the sequences in P(d_(t)). Place as many positions of E(d_(t)) as possible on the first row of Γ. If E(d_(t)) has any remaining positions, in contrast to the QPSK case, this alone does not mean that D_(min) ² needs to be lowered. Place the remaining positions of E(d_(t)) (set H) by filling in columns one at a time. Try to fill in the leftmost possible columns first and, failing that, move to the right. Try to place the most popular positions first, and try to maintain d_(s,t) and x_(1,min)(i_(P)) for all sequences. In addition, by using Δ_(dis)(i_(P)) and following the analysis in (17)-(20) and Methods 1 and 2, make sure that all the inter-sequence constraints are satisfied. If it is necessary to have disjoint sequences share columns, try to avoid the situation where mutually disjoint sequences share any of the same columns. If all positions of E(d_(t)) cannot be placed in Γ to satisfy the above conditions, first try swapping positions. If that fails too, lower d_(s,t) and/or D_(min,n) ² until all positions of E(d_(t)) can be placed in Γ while meeting the CICM interleaver constraints for d_(s,t) and D_(min,n) ².

Step 3. Set d=d+1. For every sequence i_(P) in every set of sequences P(d≧d_(t)), record d_(s,temp)(i_(P)) and D_(en) ²(i_(P)). As described in step 3 of the QPSK case, identify the sets P(d≧d_(t)), P′(d≧d_(t)), P⁺(d≧d_(t)), E′(d), H′(d), E_(Q)(d≧d_(t)), and Ē_(Q)(d≧d_(t)). Place the positions of Ē_(Q)(d≧d_(t)) and H′(d) similarly as described in step 3 of the QPSK example, but without favoring the first row, in order to best meet the d_(s,t) and D_(min,n) ² constraints while also satisfying the inter-sequence constraints. If needed, swap positions until all sequences can maintain d_(s)(i_(P))≧d_(s,t) and D_(en) ²(i_(P))≧D_(min,n) ². If d<d_(f), repeat step 3.
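The popularity ordering of E(d_(t)) used in the second step can be sketched as below. The representation of each sequence as a set of positions, the tie-breaking by position index, and the name order_by_popularity are assumptions made only for illustration.

```python
from collections import Counter

def order_by_popularity(sequences):
    """Return the distinct positions of `sequences`, most popular first.

    `sequences` is an iterable of position sets; ties are broken by the
    position index, which is an arbitrary choice made for this sketch.
    """
    counts = Counter(p for seq in sequences for p in seq)
    return sorted(counts, key=lambda p: (-counts[p], p))

# Three weight-3 sequences: position 3 appears in all of them, 5 in two.
P_dt = [{0, 3, 5}, {3, 5, 9}, {3, 7, 8}]
E_dt = order_by_popularity(P_dt)
```

Placing positions in this order means that the positions shared by the most low weight sequences are assigned to columns while the most freedom remains.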

It is interesting to compare the performance of the QPSK and 16-QAM examples with that of the same concatenated code with BPSK transmission. For that comparison, we use the standard normalized squared Euclidean distance d_(min) ²=D_(min) ²/2E_(b,avg) that considers the average bit energy E_(b,avg), and observe that E_(b,avg) in the QPSK and the 16-QAM schemes are, respectively, E_(b,avg,QPSK)=a² and E_(b,avg,16-QAM)=5a²/2. Hence, the highest achievable d_(min) ² for the QPSK and 16-QAM schemes (assuming that the interleaver Γ can be designed to achieve the highest possible D_(min) ²) are d_(min,QPSK) ²=4Rd_(t) and d_(min,16-QAM) ²=4Rd_(t) respectively, where R is the rate of the CTBC code. Note that with BPSK signaling (or QPSK with standard Gray mapping) d_(min,BPSK) ²=2Rd_(t). Hence, the CICM design of the QPSK is clearly better than the usual QPSK that uses Gray mapping. Interestingly, even the CICM-16-QAM, which transmits 4 bits per interval, has a higher value of d_(min) ² than the standard QPSK with Gray mapping.

CICM Higher Order PSK Example:

The next example demonstrates how to determine a CICM mapping policy using PSK constellations. Similar to set partitioning in Ungerbock's TCM, it is shown how to systematically expand an M=2^(m) point PSK constellation to form a 2M=2^(m+1) point PSK constellation. With the CICM mapping rule, the MSED of the constellation at the sequence level does not reduce each time the constellation size is doubled.

To begin, consider the construction of a reverse Gray coded 8-ary PSK constellation whose phase angles are in their standard positions, {0, ±π/4, ±π/2, ±3π/4, π}, as shown in FIG. 17. This 8-ary PSK constellation can be viewed as being composed of the constellation points from the 4-ary PSK (QPSK) constellation as shown in FIG. 15, plus a second copy of the 4-ary constellation shown in FIG. 15 rotated clockwise by 135 degrees. The coding of the LSBs of the resulting 8-ary PSK constellation points in FIG. 17 comes directly from these two copies of FIG. 15, while the MSB of the constellation points in FIG. 17 is set to “0” for the original points from FIG. 15 and to “1” for the copy of FIG. 15 that was rotated clockwise by 135 degrees. The resulting constellation in FIG. 17 still maintains the same Euclidean distance D₁ ² for all one bit differences as the 4-PSK constellation of FIG. 15. Hence, a CTBC code and a properly designed Γ with m=3, along with the constellation shown in FIG. 17, is capable of achieving a maximum squared Euclidean distance of D_(min) ²=4Ed_(t) (the same as in the 4-PSK constellation), and a corresponding normalized squared Euclidean distance of d_(min) ²=4Ed_(t)/(2E_(b,avg))=4Ed_(t)/(2E/3)=6d_(t), where d_(t) is the minimum Hamming distance of the CTBC code and E=2a² (note that the coordinates of FIG. 17 are (±a,±a) and its rotation of 135 degrees).

Similarly, two copies of the 8-ary constellation in FIG. 17 can be used to construct a 16-ary PSK constellation. This 16-ary PSK constellation can be viewed as being composed of the constellation points from the 8-ary PSK constellation as shown in FIG. 17, plus a second copy of the 8-ary constellation shown in FIG. 17 rotated clockwise by 157.5 degrees. The coding of the LSBs of the resulting 16-ary PSK constellation points comes directly from these two copies of FIG. 17, while the MSB of the constellation points in the resulting 16-PSK constellation is set to “0” for the original points from FIG. 17 and to “1” for the copy of FIG. 17 that was rotated clockwise by 157.5 degrees. A CTBC code and a properly designed Γ with m=4, along with this 16-ary constellation, is capable of achieving a squared minimum Euclidean distance of D_(min) ²=4Ed_(t), and a normalized squared Euclidean distance of d_(min) ²=4Ed_(t)/(2E_(b,avg))=4Ed_(t)/(2E/4)=8d_(t).

It can be seen from FIGS. 1 and 3 and the discussion above that the same procedure can be extended to systematically construct any 2^(m+1)=2M-ary PSK constellation by simply i) making a first copy of the current 2^(m)=M-ary PSK constellation, ii) creating a second copy of this current M-ary PSK constellation by rotating the first copy clockwise by [180−(90/2^(m−1))] degrees, and then iii) merging these two copies together by assigning MSB=0 to the first copy and MSB=1 to the second copy to form the resulting 2M-ary PSK constellation. Applying this process to FIG. 17 gives the 16-PSK constellation of FIG. 18.
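The copy, rotate and merge construction above can be checked numerically with the following sketch. The starting QPSK labeling is an assumption (FIG. 15 is not reproduced here); it was chosen so that one label bit separates diametrically opposite points. The radius is √2a with a=1, and the names double_psk and single_bit_sq_distances are hypothetical.

```python
import cmath
import math

def double_psk(angles, m):
    """Expand a 2^m-point PSK labeling {label: angle in degrees} to 2^(m+1) points."""
    rot = 180.0 - 90.0 / 2 ** (m - 1)  # clockwise rotation of the second copy
    out = dict(angles)                 # first copy keeps MSB = 0
    for label, ang in angles.items():
        out[label | (1 << m)] = (ang - rot) % 360.0  # second copy gets MSB = 1
    return out

def single_bit_sq_distances(angles, nbits, radius_sq=2.0):
    """For each bit index, list the distinct squared distances caused by flipping it."""
    r = math.sqrt(radius_sq)
    pt = {lab: cmath.rect(r, math.radians(a)) for lab, a in angles.items()}
    return {
        bit: sorted({round(abs(pt[l] - pt[l ^ (1 << bit)]) ** 2, 4) for l in pt})
        for bit in range(nbits)
    }

# Assumed base labeling [b1 b0]: flipping b1 moves a point by 180 degrees.
qpsk = {0b00: 45.0, 0b01: 135.0, 0b10: 225.0, 0b11: 315.0}
psk8 = double_psk(qpsk, 2)    # rotation of 135 degrees
psk16 = double_psk(psk8, 3)   # rotation of 157.5 degrees
per_bit = single_bit_sq_distances(psk16, 4)
```

Under these assumptions every bit position of the 16-point result has a single flip distance, so the rows of Γ can be ranked by the value associated with their bit.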

It is interesting to compare the above CICM mapped 16-ary PSK constellation with the above CICM mapped 16-QAM constellation that is capable of achieving d_(min) ²=4Rd_(t). The CICM-mapped 16-PSK constellation can be designed to achieve d_(min) ²=8Rd_(t) and thus to perform better than the 16-QAM constellation over both Gaussian channels and fading channels. In fact, if the frame size is large enough so that Γ can be designed to meet the CICM interleaver constraints, then d_(min) ² of CTBC codes can be increased by increasing the order of signaling M. With the above construction for building reverse Gray coded PSK constellations, each time the PSK constellation size, M, is doubled, the resulting 2M-ary PSK constellation will maintain the same D_(min) ² value as the original 4-PSK constellation. However, as M increases, both D_(min,n) ² and d_(e) also increase, thereby adding more and more sequences i_(P) into the table P(d≧d_(t)), and the number of available columns of Γ, K/m, also decreases. As a result, it becomes more difficult to design a valid Γ to achieve higher values of D_(min,n) ² without increasing the frame size. Hence, in practice, different orders of signaling can be tested and the best possible order in terms of d_(min) ² can be chosen. In addition, compared with the 16-QAM constellation, the PSK constellations come with additional advantages due to their constant envelope property. The constant envelope property allows the scheme to use a simpler (inexpensive) power amplifier at the transmitter and simpler CSI recovery and equalization at the receiver.

The construction of the interleaver Γ for the 8-ary constellation in FIG. 17 and the 16-ary constellation in FIG. 18 follows from that of the construction in the QPSK example. For example, in the 16-ary case in FIG. 18, Γ should be constructed with four rows and K/4 columns. Keeping the radius of the circle in FIGS. 17 and 18 the same as that of FIG. 15, which is √2a, the symbol energy is E=2a². Also, in FIG. 18, all single bit changes in the third bit position (out of 4) achieve the highest squared Euclidean distance contribution of 8a²=4E. Similarly, the squared Euclidean distances of all single bit changes at the first, second and fourth bit positions can be easily calculated using the cosine theorem as 7.6955a²=3.8478E, 6.828a²=3.4142E, and 4a²=2E respectively. Hence, in the construction of Γ, the third row should get the highest preference, followed in order of preference by the first, second and fourth rows. With these individual contributions, the same procedure used in the QPSK example can be used to construct Γ with the 16-ary constellation in FIG. 18.

Alternatively, if the reverse Gray coded bit vector of FIG. 18 is written as [b3 b2 b1 b0], and all the bits are consistently rearranged as [b1 b3 b2 b0], then the first row of Γ would correspond to a squared Euclidean distance of 4E, the second row 3.8478E, the third row 3.4142E, and the last row 2E. Hence it should be understood that the constellation mappings can be modified, or the rows of Γ can be assigned to bits of the constellation, to ensure that the upper rows are associated with the higher distances. Also, while reverse Gray coding is preferable in many cases, all that is really needed is that the distances between the codewords that differ by just one bit are assigned as high a distance as is practical or possible (so that known anti-Gray coding would also be a possible constellation mapping rule for use with CICM). In some cases the distances between codewords that have differences in more than one bit can vary from embodiment to embodiment and do not need to progress in any prescribed way.

Application of CICM to More General Codes:

In the CICM mapping rule design algorithms discussed above, the permutation matrix, Γ, was designed to map coded bits of a CTBC code onto a higher order constellation. While the above CICM mapping rule design algorithms map the coded bits from a CTBC code onto a target constellation, many aspects of CTBC codes were not required in the above presented design algorithms. The CICM mapping rule design algorithm made use of the tables P(d≧d_(t)), d=d_(t), d_(t)+1, . . . , d_(f). Hence, the CICM mapping rule design algorithm can be easily extended to work with any type of outer code for which the tables P(d≧d_(t)) can be prepared. All that is needed to do this is the ability to identify the low weight sequences. Note that typical BICM systems can be viewed as an outer code that feeds into a uniform interleaver, where the output of the interleaver feeds into a constellation mapper that takes the place of an inner code. Therefore, many different types of codes can be used as outer codes, and if these outer codes can be used to prepare the tables P(d≧d_(t)), then the uniform interleaver in the BICM can be replaced by the CICM interleaver, Γ. These outer codes include block codes, convolutional codes, turbo product codes, and others.

For example, consider a system that involves a simple (8,4) extended Hamming code with d₀=4, and that feeds ρ codewords of this code into an interleaver before constellation mapping onto a QPSK symbol stream. The (8,4) code has the all zero codeword, the all ones codeword and 14 weight d=4 codewords. Therefore, the table P(d≧d_(t)) of this (8,4) code up to weight d=8 will contain (a) the 14 weight d=4 codewords of each of the ρ codewords (14ρ in total), (b) the all ones codeword of each of the ρ codewords (ρ in total), and (c)

$\begin{pmatrix}\rho \\2\end{pmatrix}*14^{2}$

combinations of two codewords each with weight d=4. If the maximum weight covered by P(d≧d_(t)) needs to increase, we can extend the table to a desired weight.
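The counts in this example can be reproduced with a short sketch. The generator matrix below is one common systematic form of the (8,4) extended Hamming code and is an assumption; the description does not specify a particular generator. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb

# Assumed generator: a systematic (7,4) Hamming code plus an overall parity bit.
G = [
    [1, 0, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

def weight_distribution(gen):
    """Count codewords by Hamming weight by enumerating all messages."""
    dist = Counter()
    for msg in product((0, 1), repeat=len(gen)):
        codeword = [sum(b * g for b, g in zip(msg, col)) % 2 for col in zip(*gen)]
        dist[sum(codeword)] += 1
    return dict(dist)

def table_size_up_to_weight_8(rho):
    """Entries (a) + (b) + (c) of the table for rho interleaved codewords."""
    return 14 * rho + rho + comb(rho, 2) * 14 ** 2

dist = weight_distribution(G)
size_for_three = table_size_up_to_weight_8(3)
```

For ρ=3 this gives 42 single weight-4 entries, 3 all ones entries and 588 pairs of weight-4 codewords.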

Similarly, if there is a way to identify the lowest weight sequences and to thus prepare a corresponding table P(d≧d_(t)), the same method can be applied to other kinds of codes such as various types of convolutional codes. Additional gains can be achieved by using the permutation Γ that is chosen in accordance with the CICM interleaver constraints to rearrange the coded bits of the outer code to form symbols for transmission. It is interesting to note that, even with a relatively simple outer code, Γ can be designed to work with a target signal constellation in order to achieve very good performance. The systematic design of Γ, together with a signal constellation mapped according to a properly identified constellation mapping rule (such as a reverse Gray coded (RGC) constellation mapping rule), allows the CICM mapping approach to be applied in a variety of situations beyond CTBC encoded applications.

At present, it is difficult to enumerate all of the low weight codewords of turbo codes and LDPC codes that have large frame sizes. However, the CICM approach can be applied to certain turbo codes and LDPC codes with small to moderate frame sizes where the low weight error sequences can be enumerated and thus the table P(d≧d_(t)) can be found. All that is needed to apply CICM is that the table P(d≧d_(t)) can be built. Also, for larger frame sizes, if exhaustive algorithms or other kinds of long-running, off-line algorithms are used to identify the low weight error sequences to build the table P(d≧d_(t)), then a CICM mapper can be designed for any such turbo code or LDPC code for which the table P(d≧d_(t)) has been constructed.

Puncturing, and Variable Redundancy:

Two example approaches are provided below in order to achieve variable redundancy (also known as Rate Matching) in systems that use CTBC codes.

1. Puncturing: In this approach, we first consider a concatenation with a low rate OBC and an accumulator. Then, in order to adjust the rate, puncturing is performed at the output of the accumulator. It is well known that a low rate OBC usually comes with a high MHD d₀. Even a standard CI-2 would square the effect of this increase in the MHD. For example, consider an (8,4) OBC with d₀=4, which can be used with an accumulator and a CI-2 interleaver to construct a concatenation with rate ½ that has MHD=16. Now consider a (12,4) shortened BCH code derived from a (15,7) BCH code with d₀=5. If this (12,4) OBC is used the same way to construct a concatenation, and if the frame size is large enough, the resulting concatenation can achieve an MHD of 25. However, to bring the rate up to ½, puncturing can be applied to puncture out, on average, one bit out of three at the output of the accumulator. This puncturing can be done in an optimal manner by following the construction of the interleaver Γ. A set of K/3 coded bits is selected at the output of the accumulator that would maintain the highest MHD of the punctured code. This can be done by trying to preserve most of the coded bits of low weight coded sequences by monitoring the non-zero positions of the low weight sequences as in the construction of Γ. Also, during the execution of the CICM mapping rule design algorithm, the orderings of the sets E(d) and H(d) and/or the constraints used to place the positions in these sets and related sets of positions could preferably be chosen to maintain higher values of d_(s,t) and D_(min,n) ² within a subset containing K/m/3 columns of Γ than are achieved in the rest of the columns of Γ.

That is, the puncturing and the design of Γ to assign the punctured coded bits to bit positions within symbols can be done jointly. As an example, consider the above stated concatenation of a (12,4) OBC and an accumulator. Following steps 1 and 2, we can first form the set of distinct positions of all lowest weight sequences, E(d). If the length of E(d) is more than 2K/3, there is no way we can remove K/3 bits at the output of the accumulator while preserving the overall MHD of the punctured code at d=d_(t)=25. However, if the length of E(d) is less than 2K/3, there is a chance. The goal is to identify a set of positions that can later be punctured so as to maintain the highest possible MHD after puncturing. This can be done using the sequences that are needed to build Γ, that is, the sequences i_(P) in the tables P(d≧d_(t)), i.e., the sequences in the tables P(d) where d_(t)≦d≦d_(f). If the MHD is to be maintained at d_(t), no positions of E(d_(t)) can be removed. Similarly, one position from each of the sequences in P(d_(t)+1) can be removed. It will need to be checked that any position removed from E(d_(t)+1) is not also a member of E(d_(t)). So, in general, if the target MHD after puncturing is d_(t)′, then up to (d−d_(t)′) positions can be removed from every sequence in P(d), d>d_(t)′. When selecting positions to remove, always try to find the least popular newly added positions at every weight d, so as to affect the least number of sequences in P(d) while also maintaining the desired MHD for lower weights.

2. Use of a SPC code with the inner code: In this approach, a high rate OBC is used. Then, to adjust the rate lower, a (λ+1, λ) SPC encoder is used to further encode the output of the accumulator. As a result, the IRCC is formed by the concatenation of the accumulator and the SPC code. With this construction, the rate of the overall CTBC code can be readily adjusted by adjusting λ.
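For the puncturing approach (approach 1 above), the rule that at most (d−d_(t)′) positions may be removed from any weight d sequence can be sketched as a simple admissibility check. The representation of the table as a list of position sets and the name puncture_is_allowed are assumptions; the check covers only the sequences that are actually listed.

```python
def puncture_is_allowed(table, punctured, d_target):
    """True if every listed sequence keeps at least d_target non-zero positions.

    Equivalent to removing at most (d - d_target) positions from each
    weight-d sequence in the table.
    """
    punctured = set(punctured)
    return all(len(seq - punctured) >= d_target for seq in table)

# One weight-4 and one weight-5 sequence with a target MHD of 4.
table = [{0, 1, 2, 3}, {2, 3, 4, 5, 6}]
ok_single = puncture_is_allowed(table, {4}, 4)     # only the weight-5 sequence loses a bit
bad_shared = puncture_is_allowed(table, {2}, 4)    # the weight-4 sequence drops to 3
```

A candidate puncturing pattern would be tested this way against all of P(d), d_(t)′≦d≦d_(f), before being accepted.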

CICM Transmitter and Receiver Embodiments:

Referring now to FIG. 19, a transmitter method, apparatus, and/or system 1900 involving a CICM signal mapping subsystem is shown coupled via a communications channel to a receiver method, apparatus or system involving a CICM signal demapping subsystem.

The CICM based transmitter involves a CTBC encoder 1905 that is coupled to a CICM signal mapper that includes a CICM interleaver 1910 that is in turn coupled to a Reverse Gray coded (RGC) constellation mapper 1915. The CTBC encoder block can be implemented using any of the valid CTBC encoder embodiments as discussed herein. The CICM interleaver performs interleaving in accordance with a CICM interleaver rule Γ that is designed as discussed herein to meet one or more CICM interleaver constraints. The CICM interleaver rule permutes the coded bits of the vector v, subject to the constraint that, once mapped, the CICM mapped sequence will exhibit the best set of values of d_(s,t) and D_(min) ² that can be achieved for a given frame size, a given constellation size, and the RGC constellation mapping rule. Also, Γ can be designed to meet subordinate types of constraints such as inter-column and inter-sequence constraints as discussed herein. The RGC constellation mapper maps, for example, in accordance with the QPSK, 16-QAM, or 8-PSK constellations as shown in FIGS. 15-18, or some other constellation, such as 16-PSK or 32-QAM, for example, that makes use of a RGC mapping policy. Also, other mapper/demapper rules besides reverse Gray coded or anti-Gray coded constellation mappers could be used, as long as the constellation mapper is able to help meet the MSED requirement at the sequence or codeword level.

The CICM based receiver involves a RGC constellation demapper 1920 that is coupled to a CICM deinterleaver 1925 that is coupled to a CTBC decoder 1930. The RGC constellation demapper 1920 performs the inverse operation of the RGC constellation mapper 1915, and in practical embodiments is used to compute a set of bit metrics for later decoding in a SISO decoder. The CICM deinterleaver 1925 performs deinterleaving in accordance with the inverse of the CICM interleaver rule Γ, which is denoted as Γ⁻¹. The output of the CICM deinterleaver 1925 is typically a set of bit metrics that are coupled to the CTBC decoder 1930. The CTBC decoder 1930 can be implemented in accordance with any of the valid CTBC decoder embodiments as discussed herein. However, as each pass is made through the SISO algorithm implemented in the CTBC decoder, in order to compute new bit metrics based on the updated extrinsic information, the bits from the v sequence will need to map via the CICM interleaver 1935 to the RGC signal constellation information so that the bit metrics can be updated. The updated bit metrics then pass back through the CICM deinterleaver 1925 for further SISO decoding in the block 1930.

In the transmitter and/or the receiver 1900, rate matching and other forms of variable redundancy can be implemented using the (λ+1, λ) SPC encoder at the output of the accumulator inside the CTBC encoder/decoder blocks 1905, 1930. In such embodiments, the IRCC in the CTBC code is formed by the concatenation of the accumulator and the SPC code as discussed above. In systems where rate matching and/or other forms of variable redundancy functions are designed into the CICM permutation rule Γ, the blocks 1910 and 1925 can be implemented as discussed above to cause a subset containing less than the full K/m columns of Γ to be transmitted in any given variable redundancy frame or sub-frame.

For example, consider a case where the full CTBC code, v, will be transmitted as three sub-frames. In this case, the permutation Γ can be arranged to send the first set of K/m/3 columns in a first sub-frame, the second K/m/3 columns in a second sub-frame, and the third K/m/3 columns in a third sub-frame. Preferably the columns of Γ are organized so that the first K/m/3 columns of Γ contain a carefully constructed set of columns that maximize a given performance measure, such as the MHD of the CTBC coded vector v, in light of the fact that only the first K/m/3 columns of Γ will be available to the SISO decoder 1930. The second K/m/3 columns of Γ preferably contain a carefully constructed set of columns that maximize the MHD of the CTBC coded vector v in light of the fact that only the first 2K/m/3 columns of Γ will be available to the SISO decoder 1930. When the final K/m/3 columns of Γ have been transmitted, all of the elements of v will be available to the SISO decoder 1930. If further redundancy is then needed, a retransmission protocol can be used so that any specified subset of the columns of Γ can be retransmitted to further increase the probability of correct decoding of the vector, v.
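The three sub-frame example can be sketched as follows, assuming Γ is stored as a list of m rows of coded-bit positions and that the number of columns divides evenly among the sub-frames. The helper names split_columns and bits_available are hypothetical.

```python
def split_columns(num_columns, num_subframes=3):
    """Split column indices 0..num_columns-1 into equal consecutive groups."""
    if num_columns % num_subframes:
        raise ValueError("number of columns must divide evenly into sub-frames")
    step = num_columns // num_subframes
    return [range(i * step, (i + 1) * step) for i in range(num_subframes)]

def bits_available(gamma, subframes_received, num_subframes=3):
    """Coded-bit positions known to the decoder after some sub-frames arrive."""
    groups = split_columns(len(gamma[0]), num_subframes)
    cols = [c for g in groups[:subframes_received] for c in g]
    return sorted(row[c] for row in gamma for c in cols)

# Toy 2 x 6 arrangement (m = 2, K = 12): two columns per sub-frame.
gamma = [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
after_first = bits_available(gamma, 1)
after_second = bits_available(gamma, 2)
```

The design goal stated above then amounts to choosing which positions occupy the early columns so that the code restricted to the positions received so far keeps as high an MHD as possible.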

In embodiments where the CICM interleaver 1910 and deinterleaver 1925 are designed to work in variable redundancy systems, there will be additional control logic associated with the blocks 1910 and 1925 to implement the variable redundancy protocol. Information at a control channel level or some other higher layer such as a radio link layer or a radio physical layer control entity or data stream will be coupled to a control element associated with each of the blocks 1910 and 1925, and these control elements can be considered to be a part of the blocks 1910 and 1925 in such embodiments involving rate matching or other forms of CTBC/CICM adaptive modulation and coding.

In embodiments as mentioned above where some other form of coding is used besides CTBC codes, the blocks 1905 and 1930 can be configured to encode and decode in accordance with any selected form of coding for which the table P(d≧d_(t)) can be constructed for d=d_(t), . . . , d_(f). For example, any type of block code, and most types of trellis codes, convolutional codes, and certain turbo codes and LDPC codes can be used in the blocks 1905 and 1930 in these types of embodiments to achieve the benefits of CICM and CICM based variable redundancy as described herein.

In typical CICM communications embodiments, an encoder will be used that converts a sequence of input bits to an encoded bit sequence in accordance with an encoding rule. The encoding rule can be CTBC encoding or could be any other coding rule such as a block code or a convolutional code or any other code that produces a frame of K encoded bits, and where the encoding rule has the property that, for all possible sequences of input bits, all possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) can be identified and enumerated, where none of the possible low weight encoded bit sequences, i_(P), can have a weight less than d_(t), and the weights d_(t)≦d≦d_(f) correspond to Hamming distances. Such embodiments will also include a constrained interleaver that is configured to implement an m×K/m permutation rule. The m×K/m permutation rule is configured to permute the K encoded bits of the encoded bit sequence to a sequence of K/m subsets that each contain m encoded bits. This permutation can be optionally/preferably implemented using the CICM permutation matrix, Γ. A constellation mapper will then receive the sequence of K/m subsets and use a pre-defined constellation mapping rule to convert the sequence of K/m subsets to a sequence of K/m 2^(m)-ary signal constellation points. The m×K/m permutation rule and the constellation mapping rule are jointly selected to ensure that a pre-defined target value of MSED is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f). The m×K/m permutation rule and the constellation mapping rule are preferably also jointly selected to ensure that a pre-defined target value of minimum symbol Hamming distance, d_(s), is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f). The Hamming distance d_(f) is preferably selected to ensure that any possible encoded bit sequence, i_(P), that has a weight d>d_(f) will be guaranteed to have at least the pre-defined target value of MSED and the pre-defined target value of minimum symbol Hamming distance, d_(s). In typical embodiments, the constellation signal mapper uses either anti-Gray coding or RGC.
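The m×K/m permutation step can be sketched as follows. The sketch assumes Γ is given as m rows of K/m coded-bit positions and that the bit taken from row 0 becomes the most significant bit of each symbol label; both the representation and the name cicm_symbol_labels are illustrative assumptions, and the final lookup from label to constellation point is omitted.

```python
def cicm_symbol_labels(coded_bits, gamma):
    """Group K coded bits into K/m m-bit labels, one label per column of gamma."""
    m, num_cols = len(gamma), len(gamma[0])
    positions = sorted(p for row in gamma for p in row)
    if positions != list(range(len(coded_bits))):
        raise ValueError("gamma must be a permutation of the K bit positions")
    labels = []
    for c in range(num_cols):
        label = 0
        for r in range(m):
            # Row 0 supplies the most significant bit (an assumed convention).
            label = (label << 1) | coded_bits[gamma[r][c]]
        labels.append(label)
    return labels

# K = 8 coded bits mapped through a toy 2 x 4 arrangement (m = 2).
bits = [1, 0, 1, 1, 0, 0, 1, 0]
gamma = [[0, 3, 5, 6], [2, 7, 1, 4]]
labels = cicm_symbol_labels(bits, gamma)
```

Each resulting label would then index a point of the 2^(m)-ary constellation under the chosen constellation mapping rule.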

As is discussed in further detail below in connection with FIGS. 21, 23, and 24, the CICM mapping rule/CICM signal mapper can include a spatial mapper. In such cases the CICM signal mapper can be viewed as a constellation and spatial mapper that is configured to couple a frame of K/m 2^(m)-ary signal constellation points through a sequence of selected ones of a plurality of spatial channels. A spatial modulation algorithm together with CICM signal mapping is used to identify the sequence of selected ones of a plurality of spatial channels so as to ensure that the pre-defined target value of MSED is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) that traverse the plurality of spatial channels. In CICM constellation and spatial mappers that also optionally maintain a minimum symbol Hamming distance, the sequence of selected ones of a plurality of spatial channels is also selected to ensure that the pre-defined target value of minimum symbol Hamming distance, d_(s), is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) that traverse the plurality of spatial channels. The spatial channels can involve channels between a given transmit and receive antenna in a multi-antenna embodiment, or paths between one or more lasers and a given one of a plurality of coherent laser-signal detectors. In the case of the MIMO/SM laser channels, different filter paths through a discrete time or other type of transmit optical filter bank and the various paths through a receive optical filter bank to a selected coherent detector make up the MIMO type channel structure. FIGS. 21-24 and the discussion thereof provide more details. As discussed in greater detail below, a MIMO modulation rule may alternatively be used whereby a plurality of different constellation points are coupled through a plurality of different spatial channels simultaneously.

CICM Mapping Rule Design Algorithm Embodiments:

Referring now to FIG. 20, a CICM mapping rule design algorithm 2000 is provided to design a CICM mapper 1910, 1915, and its CICM mapper inverse 1920, 1925, such as shown in FIG. 19. The method 2000 begins by identifying a target signal constellation, which can be, for example, QPSK, 16-QAM, 8-PSK, or any other selected signal constellation, such as, for example, a 4-dimensional constellation as used in optical communications in which two 2-dimensional constellations are transmitted simultaneously on the horizontal and vertical polarizations. Once the target signal constellation is identified, a RGC constellation mapping rule is also identified to map bits onto the identified signal constellation. The RGC constellation mapping rule is preferably designed to assign higher distances between constellation points that differ by a single bit and progressively smaller distances between constellation points that differ by more bits up to m bits. Next, the action 2005 initializes d_(s,max)=d_(t), d_(s,t)=d_(s,max) and D_(min) ²=D_(min,max) ², where these values are selected based upon the identified signal constellation and the target MHD value, d_(t), of the CTBC coded sequences to be mapped using the CICM mapping rule 1910, 1915.

Next control passes to an action 2010 which initializes d to d=d_(t). Control next passes to an action 2015 which determines a set of sequences, {i_(P)}=P(d), which includes the position vectors p(i_(P)) for the weight d CTBC coded sequences, i_(P)=0, . . . , N_(P)(d)−1. Other information can be optionally included in the table P(d) such as the sets of associated vectors v(i_(P)) and c(i_(P)), for example. Also, the newly identified constituent table P(d) can be used to update an aggregate table, P(d≧d_(t)), and the sequences i_(P)=0, . . . , N_(P)(d)−1 can be added to a larger set of sequences with weights d≧d_(t), i_(P)=0, . . . , N_(P)(d≧d_(t))−1.

Control next passes to an action 2020 which attempts to place into Γ any and all of the positions associated with the sequences i_(P) in P(d) that have not already been placed. As discussed above, all such placements are made in accordance with the CICM constraints, i.e., d_(s)(i_(P))≧d_(s,t) and D_(en)²(i_(P))≧D_(min,n)². The placements of these positions can also optionally be made in accordance with the inter-column and inter-sequence constraints as discussed above. Moreover, any of the swaps discussed above or similar types of swaps can be made with the goal of enforcing the CICM interleaver constraints on all sequences in P(d≧d_(t)) for the current value of d as determined by the action 2010 or 2035.

Control next passes to an action 2025 which determines whether the CICM interleaver constraints were able to be achieved in the action 2020. If the CICM interleaver constraints were achieved, control passes to the action 2035 where the distance, d, is incremented as d=d+1. If the CICM interleaver constraints were not achieved, control passes to the action 2030 where the target minimum symbol Hamming distance, d_(s,t), and the target normalized minimum Euclidean distance, D_(min,n)², are decreased to their next lower values, which preferably correspond to the highest possible values that are lower than their current values. Control first passes out of action 2030 to action 2015 to allow the design algorithm to attempt to place the current set of {i_(P)} sequences in P(d) using these lowered values. When this branch is taken out of the action 2030, the action 2030 preferably removes already placed positions that can be removed from Γ without violating the CICM interleaver constraints subject to the lowered d_(s,t) and D_(min,n)² values. If the action 2030 is reentered after this attempt fails, the second branch out of the action 2030 will be taken to restart the algorithm 2000 at the action 2010 using the original d=d_(t) value.

Control passes out of action 2035 to action 2040 where it is determined if the incremented value of d is greater than d_(f). If d is not greater than d_(f), then control passes from the action 2040 to the action 2015. If d is greater than d_(f), then control passes from the action 2040 to the action 2045. This logic ensures that the algorithm is allowed to run for d=d_(t), d_(t)+1, . . . , d_(f). The action 2045 provides a valid CICM permutation matrix, Γ, which identifies the CICM interleaver rule.
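By way of a non-limiting illustration, the control flow of actions 2005 through 2045 can be sketched as follows. The function name `design_cicm`, the list `target_ladder` of progressively lower (d_(s,t), D_(min,n)²) pairs, and the callbacks `enumerate_sequences` and `try_place` are hypothetical names introduced only for this sketch. The sketch assumes that a restart at the action 2010 begins with an empty Γ and that the optional removal of already placed positions in the action 2030 is handled inside `try_place`; neither assumption is dictated by FIG. 20.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Targets = Tuple[int, float]  # (d_s_t, D2_min_n), hypothetical pairing of the two targets


def design_cicm(
    d_t: int,
    d_f: int,
    target_ladder: Sequence[Targets],
    enumerate_sequences: Callable[[int], List[tuple]],
    try_place: Callable[[Dict[int, object], List[tuple], Targets], bool],
) -> Tuple[Dict[int, object], Targets]:
    """Control-flow sketch of actions 2005-2045; the placement logic is delegated."""
    level = 0                        # action 2005: start from the highest target values
    gamma: Dict[int, object] = {}    # placements made so far in Gamma
    d = d_t                          # action 2010
    retried = False                  # True once action 2030 has already retried this d
    while d <= d_f:                  # action 2040: stop once d exceeds d_f
        seqs = enumerate_sequences(d)                     # action 2015: table P(d)
        if try_place(gamma, seqs, target_ladder[level]):  # actions 2020 and 2025
            d += 1                   # action 2035
            retried = False
            continue
        level += 1                   # action 2030: lower the targets to the next values
        if level >= len(target_ladder):
            raise RuntimeError("no lower target values left to try")
        if retried:                  # second branch: restart at action 2010
            gamma, d, retried = {}, d_t, False
        else:                        # first branch: retry the same d at action 2015
            retried = True
    return gamma, target_ladder[level]   # action 2045: Gamma is complete
```

The returned pair gives the placements together with the target values under which all weights d_(t) through d_(f) were successfully placed.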

CICM-DCI Embodiments:

Next consider the problem of designing vectorizable permutations for the CICM permutation, Γ. The CICM permutation has been defined in terms of the m×K/m permutation matrix, Γ. If the m×K/m permutation matrix, Γ, is viewed as

Γ=[Γ₁ Γ₂ . . . Γ_(K/m)]∈Z^(m×K/m),  (26)

where each Γ_(j)∈Z^(m), for j=1, . . . , K/m, then one can define

$\begin{matrix}{\Gamma_{DCI}^{\prime} = \begin{bmatrix}\Gamma_{1} & \Gamma_{K/(M*m) + 1} & \ldots & \Gamma_{(M - 1)K/(M*m) + 1} \\ \Gamma_{2} & \Gamma_{K/(M*m) + 2} & \; & \vdots \\ \vdots & \; & \; & \vdots \\ \Gamma_{K/(M*m)} & \ldots & \; & \Gamma_{K/m}\end{bmatrix} \in Z^{K/M \times M}.} & (27)\end{matrix}$
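As a hedged illustration of the rearrangement in equation (27), the following sketch stacks the column vectors Γ_(j) of an m×K/m matrix into a K/M×M matrix. The function name `gamma_to_dci_prime` and the list-of-lists representation are assumptions made only for this example.

```python
def gamma_to_dci_prime(gamma, M):
    """Rearrange an m x (K/m) matrix Gamma into the (K/M) x M layout of equation (27).

    Block-column c of the result holds Gamma_{c*B+1}, ..., Gamma_{(c+1)*B} stacked
    vertically, where B = K/(M*m) and each Gamma_j is one column of `gamma`.
    """
    m = len(gamma)
    n_cols = len(gamma[0])            # K/m column vectors Gamma_j
    if n_cols % M != 0:
        raise ValueError("K/m must be divisible by M")
    B = n_cols // M                   # K/(M*m) vectors per block-column
    n_rows = m * B                    # K/M rows in the result
    return [
        [gamma[r % m][c * B + r // m] for c in range(M)]
        for r in range(n_rows)
    ]
```

For example, with m=2, K=8 and M=2, the columns Γ₁=(0,1), Γ₂=(2,3), Γ₃=(4,5), Γ₄=(6,7) are arranged so that Γ₁ and Γ₂ fill the first column of the result and Γ₃ and Γ₄ fill the second.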

The elements of the matrix Γ_(DCI)′ as defined in equation (27) correspond to permutation indices that point back into the vector v. In terms of the CI-2, CI-3, or CI-4 type permutations, c=π⁻¹[v], so that indirection can be used (π→v→c) to construct another permutation matrix, Γ_(DCI). That is, Γ_(DCI) is defined to be a matrix just like Γ_(DCI)′ of equation (27), but whose elements correspond to permutation indices pointing back into the vector c instead of the vector v. The elements of the matrix Γ_(DCI) are related to the elements of the matrix Γ_(DCI)′ via the constrained interleaver permutation u=π[c]. The reason Γ_(DCI) is defined in terms of the coded bit positions of the vector c is because of the way data is stored in the parallel access 2D memory 710, 1160, 1240 used within the SISO decoders described above in connection with FIGS. 7 and 11-12.

As discussed in connection with FIGS. 7-12 above, just like U=π[C] is defined to be the matrix representation of the constrained interleaver permutation u=π[c], Γ_(DCI)=π_(CICM)[C] is defined to be a matrix representation of a permutation from the positions of the c vector to the CICM symbols arranged similarly to equation (27). However, additional steps will be provided to ensure that the permutation matrix, Γ_(DCI), satisfies the vectorization Constraint 6.

If the CICM permutation matrix, Γ, is already known, then the construction of Γ_(DCI) via Γ_(DCI)′ of equation (27) is straightforward. However, in practice it will often be desirable to select the permutation π_(CICM)[] to satisfy Constraint 6, so that it can be factored as Γ_(DCI)=π_(CICM)[C]=π_(LSB,CICM)^(πi_(row)={0, . . . ,K/M−1})[π_(MSB,CICM)[C]], where π_(MSB,CICM)[] represents a single permutation over the integer ring {0, . . . , K/M−1} which is applied down each column of C, and π_(LSB,CICM)^(πi_(row)={0, . . . ,K/M−1})[] represents a set of K/M different permutations, each defined over the integer ring {0, . . . , M−1}, and each respectively applied across row π_(MSB,CICM)[i_(row)] of C. Using these notations, row and column indices, (i_(row), j_(col)), of the matrix C are permuted to row and column indices, (π_(MSB,CICM)[i_(row)], π_(LSB,CICM)^(πi_(row))[j_(col)]), of the matrix Γ_(DCI). Constraint 6 ensures that any given row of elements in the matrix C maps to a row of the same elements, but in an intra-row-permuted ordering, in the matrix Γ_(DCI). For example, Constraint 6 will require that the entire last row of the matrix 710 whose row index is i_(row)=4 will permute to row π_(MSB,CICM)[i_(row)] in Γ_(DCI) so that [π(14), π(9), . . . π(39)] will all be on the same row, π_(MSB,CICM)[i_(row)], but with a scrambled ordering in accordance with an intra-row permutation, π_(LSB,CICM)^(πi_(row))[]. An M×M interconnection network similar to 730 can be provided to perform each of the needed intra-row permutations, π_(LSB,CICM)^(πi_(row)=π{0, . . . ,K/M−1})[].
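The factored form described above can be illustrated with the following minimal sketch. The helper names, the use of randomly drawn permutations, and the row-major filling of C are assumptions made only for illustration; an actual embodiment would select the inter-row and intra-row permutations to meet the CICM interleaver constraints.

```python
import random


def random_factored_permutation(n_rows, M, seed=0):
    """Draw a hypothetical inter-row permutation and one intra-row permutation per row."""
    rng = random.Random(seed)
    pi_msb = list(range(n_rows))
    rng.shuffle(pi_msb)
    pi_lsb = []
    for _ in range(n_rows):
        p = list(range(M))
        rng.shuffle(p)
        pi_lsb.append(p)
    return pi_msb, pi_lsb


def apply_factored_permutation(C, pi_msb, pi_lsb):
    """Move element (i_row, j_col) of C to (pi_msb[i_row], pi_lsb[pi_msb[i_row]][j_col])."""
    n_rows, M = len(C), len(C[0])
    out = [[None] * M for _ in range(n_rows)]
    for i_row in range(n_rows):
        dst_row = pi_msb[i_row]
        for j_col in range(M):
            out[dst_row][pi_lsb[dst_row][j_col]] = C[i_row][j_col]
    return out


def satisfies_constraint_6(C, G):
    """Check that every row of C reappears as one row of G, up to intra-row ordering."""
    rows_of_G = {frozenset(row) for row in G}
    return all(frozenset(row) in rows_of_G for row in C)
```

Any permutation built this way keeps the elements of each row of C together on a single row of the result, which is the property that allows an M×M interconnection network to realize the intra-row reordering one row at a time.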

To understand how the memory 710 and related 2D memories used in the SISO decoder (1160, 1240) can be accessed in accordance with all of the C, U, and Γ_(DCI) matrices, consider an example where the constrained interleaver, U=π[C], is a DCI with π_(MSB)[] selected to correspond to the MSBs of a QPP interleaver as discussed above in connection with the examples discussed using FIGS. 7-12. In this example, assume that π_(MSB,CICM)[] is also selected to correspond to the MSBs of a QPP interleaver, but one that uses different parameters, f₁ and f₂, as compared to the QPP interleaver used to implement π_(MSB)[]. During the first half of the SISO iteration, block 705 counts/increments the row index, i_(row), in accordance with the QPP ordering of π_(MSB)[]. After the first half of the SISO iteration, the block 705 generates a sequence of row addresses, i_(row), using a QPP address generator that implements the permutation rule, π_(MSB,CICM)[]. This allows the individual bits of Γ_(DCI) to be accessed in the ordering shown in equation (27) and used to update a set of bit metrics as described in further detail below. As each row is accessed, M (e.g., M=8 way parallelism in FIGS. 7 and 11-12) bits from M different symbols, Γ_(j)∈Z^(m), will be accessed. The intra-row permutations, π_(LSB,CICM)^(πi_(row)=π{0, . . . ,K/M−1})[], will be implemented using the M×M interconnection network 730, 1250 to update the receive metrics 1210 which correspond to bit metrics that are updated as discussed in further detail below. During the second half of each SISO iteration, the block 705 acts as a sequential up/down counter that increments/decrements the row index, i_(row).
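As a hedged sketch of how a QPP address generator of the kind attributed to the block 705 might produce its address sequence, the following generator evaluates π(i)=(f₁·i+f₂·i²) mod n recursively using only additions. The function name and the recursive formulation are assumptions for illustration and are not asserted to be the structure of the block 705; the parameters n, f₁ and f₂ are supplied by the caller.

```python
def qpp_addresses(n, f1, f2):
    """Yield pi(i) = (f1*i + f2*i*i) mod n for i = 0, ..., n-1 using additions only.

    The recursion uses pi(i+1) - pi(i) = f1 + f2*(2*i + 1), whose own increment
    from one step to the next is the constant 2*f2.
    """
    addr = 0
    step = (f1 + f2) % n          # pi(1) - pi(0)
    for _ in range(n):
        yield addr
        addr = (addr + step) % n
        step = (step + 2 * f2) % n
```

Using one generator instance with the parameters of π_(MSB)[] and another with the parameters of π_(MSB,CICM)[] would yield the two different row orderings from the same counter structure.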

As is known to those of ordinary skill in the art, M. Isaka et al., “On the iterative decoding of multilevel codes,” IEEE JSAC, Vol. 19, No. 5, May 2001, pp. 935-943 (“the Isaka reference”) teaches known ways to update a set of bit metrics when higher order constellations are in use. Using this as a starting point, an aspect of the present invention uses this concept to update a set of bit metrics after the soft decoding of the inner code. Similar to calculating the extrinsic information (LLR values) of the input bits of the inner code, the extrinsic information of the output bits of the inner code can also be found at the same time during the soft decoding of the inner code. Just like the calculation of the extrinsic information of the input bits, the extrinsic information of the output bits can again be calculated by considering the transitions that favor bit 0 and bit 1 separately for the output bit in consideration. Using these extrinsic LLR values of the output bits of the inner code which form the M-ary transmitted symbols, the probability of each of the M symbols during every interval can be calculated.

For example, consider the case where 8-PSK is used for transmission. In such a system, during any interval, three output bits of the CTBC code, v₁, v₂ and v₃, are used to form an 8-PSK constellation point. Let Le₁, Le₂ and Le₃ be the extrinsic information of these three bits found in the decoding of the inner code. In order to calculate the updated bit metric of v₁, identify a set of constellation points, S₀, that favor the event that v₁=0 and another set of constellation points, S₁, that favor the event that v₁=1. In this 8-PSK example, the sets, S₀ and S₁, will contain four constellation points each. Note that any i^(th) extrinsic LLR value, denoted Le_(i), can be expressed in terms of the probabilities as Le_(i)=ln{{P(v_(i)=0)}/{P(v_(i)=1)}}. Hence, P(v_(i)=0) and P(v_(i)=1) can be expressed as,

$\begin{matrix}{P\left( v_{i} = 0 \right) = \frac{e^{Le_{i}}}{1 + e^{Le_{i}}} \quad \text{and} \quad P\left( v_{i} = 1 \right) = \frac{1}{1 + e^{Le_{i}}}.} & (28)\end{matrix}$
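Equation (28) can be evaluated as in the following short sketch. The function name `llr_to_probs` is hypothetical, and the branch on the sign of the LLR is an implementation choice made here only to avoid floating-point overflow for large magnitudes; it does not change the values given by equation (28).

```python
import math


def llr_to_probs(le):
    """Return (P(v=0), P(v=1)) for an extrinsic LLR Le = ln(P(v=0)/P(v=1))."""
    if le >= 0:
        # Rewrite 1/(1+e^Le) as e^-Le/(1+e^-Le) so the exponent is never large.
        e = math.exp(-le)
        p1 = e / (1.0 + e)
        return 1.0 - p1, p1
    e = math.exp(le)
    return e / (1.0 + e), 1.0 / (1.0 + e)
```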

Therefore, for every constellation point s_(j) in S₀ and S₁, the probability contribution to constellation point s_(j) can be found using the extrinsic information from the remaining bit positions 2 and 3 by multiplying the respective probabilities of the bit positions 2 and 3 obtained according to equation (28). Then the bit metric for v₁, b(v₁), can be updated by following equation (27) of the Isaka reference as,

$\begin{matrix}{{b\left( v_{1} \right)} = {\ln\left\lbrack \frac{\left\{ {\sum\limits_{s_{j} \in S_{n}}{{P\left( r \middle| s_{j} \right)}{P\left( s_{j} \right)}}} \right\}}{\left\{ {\sum\limits_{s_{j} \in S_{1}}{{P\left( r \middle| s_{j} \right)}{P\left( s_{j} \right)}}} \right\}} \right\rbrack}} & (29)\end{matrix}$

The same process can be continued to calculate the updated bit metric of the other two bit positions v₂ and v₃ as well. For example, to calculate b(v₂) the sets S₀ and S₁ will be defined in accordance with bit v₂ instead of bit v₁. The value of b(v₁) in the above equation can be approximately calculated by considering only the significant term in each summation which results in equation (18) of the Isaka reference as

$\begin{matrix}{{b\left( v_{1} \right)} \approx {{\max\limits_{s_{j} \in S_{1}}\left\{ {\ln \left( {P\left( r \middle| s_{j} \right)} \right)} \right\}} - {\max\limits_{s_{j} \in S_{0}}\left\{ {\ln \left( {P\left( r \middle| s_{j} \right)} \right)} \right\}} + {2\left( {x_{21} - x_{20}} \right){{Le}\left( v_{2} \right)}} + {2\left( {x_{31} - x_{30}} \right){{Le}\left( v_{3} \right)}}}} & (30)\end{matrix}$

where x₂₀ and x₂₁ represent the second bits (in natural binary) of the constellation points S_(a) and S_(b) chosen in the sets S₀ and S₁, respectively, in the maximization. Similarly, x₃₀ and x₃₁ represent the third bit positions of S_(a) and S_(b), respectively.

Note that each Γ_(j)∈Z^(m) in equation (26) has different weights per element, e.g., 4E, 3.8478E, 3.4142E, 2E. As discussed above, it is important that certain specified elements of various low weight sequences listed in the relevant tables P(d) map to rows of Γ that correspond to the higher weights. Therefore, certain permutations π_(MSB,CICM)[] will be favored over others. For example, if a particular permutation π_(MSB,CICM)[] maps the bulk of the elements of the low weight sequences listed in the relevant tables P(d) to the higher weighted rows of Γ, this permutation will be favored over other candidate permutations. This criterion can be used to select a good candidate permutation, π_(MSB,CICM)[], over other candidates. For example, if a QPP permutation is being used to implement π_(MSB,CICM)[], a set of QPP parameters can be selected based on a measure of the ability of the permutation to permute the coded bits of the low weight error sequences to the higher weighted rows in Γ_(DCI). Also, the permutation π_(MSB,CICM)[] can be specially designed as a deterministic permutation rule that provides a good measure of the mapping of the coded bits of the low weight error sequences to the higher weighted rows in Γ_(DCI).
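One possible way to turn the selection criterion above into a number is sketched below. The scoring rule (a plain sum of row weights over all positions of all listed low weight sequences), the function names, and the argument `row_of_position` (mapping each coded bit position to its row in C) are hypothetical choices made for illustration. The sketch also assumes, based on the stacking shown in equation (27), that row r of Γ_(DCI) carries the weight of element r mod m of the vectors Γ_(j).

```python
def score_msb_permutation(pi_msb, low_weight_seqs, row_of_position, row_weights):
    """Sum the row weights reached by the positions of the listed low weight sequences."""
    m = len(row_weights)
    total = 0.0
    for seq in low_weight_seqs:
        for pos in seq:
            dst_row = pi_msb[row_of_position[pos]]   # row of Gamma_DCI after pi_MSB,CICM
            total += row_weights[dst_row % m]        # assumed weight of that row
    return total


def pick_best_permutation(candidates, low_weight_seqs, row_of_position, row_weights):
    """Return the candidate inter-row permutation with the highest score."""
    return max(
        candidates,
        key=lambda p: score_msb_permutation(p, low_weight_seqs, row_of_position, row_weights),
    )
```

A higher score indicates that more of the low weight sequence positions land on the higher weighted rows; a practical design could weight individual sequences differently or score only the specified elements of each sequence.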

It is also possible to define a modified permutation π_(MSB,CICM)[]that is modified to perform local inter-row permutations, e.g., if m=3,[i_(row), i_(row)+1, i_(row)+2]→[i_(row)+2, i_(row), i_(row)+1]. Thelocal inter-row permutations can be used to find more favorablepermutations to be applied to the columns in accordance with theweighting of the rows of Γ_(DCI). Such permutations could be applied perm rows and per column. That is, different groups of m rows and differentcolumns could be modified on an individual basis. All such modificationsare contemplated; however it is realized that such embodiments involvemore hardware complexity. In the discussion below, a simpler permutationrule design example is provided, but it is understood that suchadditional modifications can be made to improve the ability to find goodpermutations at the expense of additional real-time hardwarerequirements and complexity.

To understand how to design Γ_(DCI), refer again to FIG. 18 and consider the action 1820. When a CICM-DCI is being designed, the action 1820 amounts to first applying π_(MSB,CICM)[] to all columns of C, and then identifying row permutations π_(LSB,CICM)^(πi_(row)=π{0, . . . ,K/M−1})[] that will accomplish the result of the swaps of FIG. 18, i.e., the row permutations are selected to attempt to meet the CICM interleaver constraints. In the event that the constraints cannot be met, a modified permutation π_(MSB,CICM)[] as described above can optionally be used at the expense of additional hardware complexity to meet the CICM interleaver constraints. Otherwise a different π_(MSB,CICM)[] can be selected and/or the parameters d_(s,t) and D_(min)² can be lowered. Similar to the constrained interleaver design techniques of FIGS. 8 and 10, a key design concept is to select π_(MSB,CICM)[] to be an easily implemented deterministic permutation such as a QPP interleaver, and to then use the row permutations π_(LSB,CICM)^(πi_(row)=π{0, . . . ,K/M−1})[] to ensure the interleaver constraints are enforced. The key difference is that in FIG. 18, the CICM interleaver constraints are used instead of the CI-3 or the CI-4 interleaver constraints. However, Constraint 6, the vectorization constraint, is commonly enforced in order to generate a vectorizable DCI embodiment.

Once Γ_(DCI) is designed and is available, the method, apparatus and systems of 700 and 1100 of FIGS. 7 and 11 can be used to perform SISO iterations with CICM bit metric updating. A SISO iteration begins as previously described in accordance with FIGS. 7 and 11. However, after the actions 1117 and 1126, a bit metrics update will preferably be performed. The bit metrics update is computed by having the address generator 705 increment according to π_(MSB,CICM)[] so that the permutation-count π_(MSB,CICM)[i_(row)] is produced as i_(row)=0, . . . , K/M−1. Then π_(LSB,CICM)^(πi_(row)=π{0, . . . ,K/M−1})[] is applied to each π_(MSB,CICM)[i_(row)]^(th) row. This way, the sequencing of actions 1117 and 1126 causes the individual Γ_(j)∈Z^(m) vectors within Γ_(DCI) to be produced. Note that the 2D memory array 710 is thus able to provide all of the u/v (inner code), c (outer code), and Γ (CICM mapper's) sequencing in a fully parallel/vectorizable manner.

Hence it should be understood that an aspect of the present invention is a parallel, vectorizable DCI that is able to provide three or more different permutation sequences from the same memory, 710. The present invention contemplates that such structure and functionality can lead to improved joint coding and modulation/signal mapping systems with improved coding performance that is derived from improved Euclidean distance and/or symbol Hamming distance.

CICM with Unequal Error Protection Embodiments:

Multimedia applications usually require unequal error protection for different types of information streams. For example, in systems where both data and Voice over IP (VoIP) packet streams are present, the data streams and the VoIP streams can have different required levels of error probability/error rates. Similarly, in live streaming video, the video stream and the audio stream can have different required levels of error probability/error rates. Multilevel codes are often used to provide unequal levels of error protection to different data streams by employing a more powerful code for the data stream(s) that require a higher level of protection.

While the CICM mapping rule design algorithms described above were used to design mapping rules that had the same error probability for all message bits, CICM mapping rules can also be designed to provide different levels of error protection for different subsets of the message bits. For example, before being passed over a data link connection, a network layer or link layer interface unit can be used to examine a packet stream to be sent over the data link/physical channel. The bits in the packet stream may be categorized as packet header bits and according to packet payload type as indicated by the header bits. In a given example, header bits and TCP packet payloads could be assigned a first error protection level, while VoIP and audio payloads could be assigned a second error protection level and video payloads a third error protection level. In broadcasting applications, different error protection levels could be assigned for use with control bits, audio data stream bits, and video data stream bits.

To understand how a CICM mapping rule can be designed to provide unequal levels of error protection, consider a specific example where an (8,4) block code is used to perform the coding and the coded bits of this (8,4) code are then constellation mapped onto a QPSK constellation using a reverse Gray code mapping. In this example, assume that each frame of N message bits can be divided into a first stream that has N/2 message bits and a second stream that also has N/2 message bits. The first stream is assumed to require a lower error probability (higher error protection), while the second stream requires a lower level of error protection (allows a higher error probability) as compared to the first stream. In this example, K/2=N coded bits from the first stream and K/2 coded bits from the second stream are to be transmitted jointly in a frame of K coded bits using QPSK modulation while maintaining the lower error rate for the first stream. This can be accomplished by placing all of the K/2 coded bits of the first stream on row 1 of the CICM interleaver, Γ, and all of the K/2 coded bits of the second stream on row 2. In this example using the (8,4) Hamming code and the reverse Gray coded QPSK constellation, the minimum squared Euclidean distance of the first stream will be D_(min,1)²=4*8a²=32a², while the minimum squared Euclidean distance of the second stream will be D_(min,2)²=4*4a²=16a². Hence, instead of using two different codes as is performed in MLC systems, the CICM interleaver rule can be used to produce unequal error protection while using just a single block code applied separately to both of the bit streams associated with each of the reverse Gray coded bits of the QPSK constellation.
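The arithmetic of this example can be checked with the short sketch below. The particular generator matrix is one common systematic form of the (8,4) extended Hamming code and is an assumption, since the description does not specify a generator; the names `G`, `min_distance`, `ROW_SED` and `stream_msed` are likewise hypothetical. The per-row values of 8a² and 4a² are the squared distances produced by a single bit error on row 1 and row 2 of the reverse Gray coded QPSK constellation in the example, and the sketch assumes each bit error of a codeword falls in a different QPSK symbol, with no error on the other row of that symbol.

```python
from itertools import product

# Assumed systematic generator of an (8,4) extended Hamming code, G = [I | P].
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

# Squared Euclidean distance per single bit error, in units of a^2, by row of Gamma.
ROW_SED = {1: 8.0, 2: 4.0}


def min_distance(gen):
    """Minimum Hamming weight over all nonzero codewords of a linear block code."""
    best = None
    for msg in product((0, 1), repeat=len(gen)):
        if not any(msg):
            continue
        codeword = [sum(b * g for b, g in zip(msg, col)) % 2 for col in zip(*gen)]
        weight = sum(codeword)
        best = weight if best is None else min(best, weight)
    return best


def stream_msed(row, d_min):
    """MSED (in units of a^2) of a stream whose coded bits all sit on one row."""
    return d_min * ROW_SED[row]
```

With a minimum Hamming distance of 4, the two streams obtain 32a² and 16a², matching the values D_(min,1)² and D_(min,2)² stated above.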

The above example can be extended to any constellation with any number of data streams. The codewords of the block code of any identified stream are permuted to a specified row of Γ so as to meet a desired minimum squared Euclidean distance. For example, in the above case of two streams, if 16-PSK or 16-QAM is used, then the first and second rows of the CICM interleaver matrix may be primarily used for the first stream that requires higher protection while the last two rows can be primarily used for the second stream.

While the previous example used a single block code, this same basic approach can also be extended to applications where a convolutional code or a concatenated code, like a CTBC code, is used in lieu of the above-described (8,4) block code to supply the coded bits to be constellation mapped, e.g., to a reverse-Gray coded constellation, via CICM. In general, any of the above-mentioned codes or any other code whose P(d) tables can be identified can be used. In such situations, CICM is generally designed using a respective set of P(d) tables and by then ensuring that different low distance error sequences as listed in the P(d) tables end up achieving corresponding different desired MSEDs while also maintaining a corresponding symbol Hamming distance, d_(s). The discussion below explains how this is achieved in the context of a CTBC code.

When the coding is performed in accordance with a CTBC code, and when it is desired to provide unequal error protection to different sub-streams of message bits, it is necessary to first identify a subset of codewords of the OBC to be used to encode the message bits from the different sub-streams. For any given sub-stream, there will be an associated set of OBC codewords that will correspond to the inputs of the IRCC that end up generating a corresponding subset of coded bits, v_(s), of the entire sequence, v. Next consider the error sequences that involve any of the coded bit positions that correspond to elements of v_(s). All low distance error sequences that need to be considered will be listed in the P(d) tables used to design the CICM interleaver, Γ. If all of the low distance error sequences involving coded bit positions from the subset v_(s) can be ensured to have a specified higher Euclidean distance, then it is possible to maintain the specified higher level of protection for the message bits associated with the corresponding subset of codewords of the OBC that correspond to the coded bit positions of v_(s).

Hence, when designing a CICM mapping rule, it is desirable to use rows of Γ with a higher Euclidean distance for the coded bit positions involved in the error sequences that include any of the elements of v_(s). Depending on the constellation and the desired number of streams, the low distance error sequences listed in the P(d≧d_(t)) tables can be used to form different groups of coded bits that will need respective unequal error protection levels.

In order to systematically select the sets of codewords of the OBC for different levels of error protection, consider the case where a Γ has already been constructed for equal error protection. At this point the P(d≧d_(t)) table that lists all the coded sequences of v up to weight d_(f) will have already been prepared. The sequences in the P(d≧d_(t)) table can then be used to calculate the actual Squared Euclidean Distance (SED) of each coded sequence in the table P(d≧d_(t)), each of which, by construction, must be at least as high as D_(min)². The goal is to next identify two sets of codewords, CW₁, which contains codewords that have a higher level of protection, i.e., of at least D_(min,1)², and CW₂, which contains the remaining codewords which have a lower level of protection, i.e., of at least D_(min,2)², where D_(min,2)²<D_(min,1)².

At this point some observations will be made that will help to develop an algorithm to identify the sets CW₁ and CW₂ given a particular code and given a starting CICM permutation matrix, Γ, that was developed for equal error protection. The CICM permutation matrix, Γ, will then need to be modified/adjusted in a way that maintains the symbol Hamming distance at d_(s) and achieves the targets D_(min,1)² and D_(min,2)² for the identified sets CW₁ and CW₂. In order to determine how to identify the sets CW₁ and CW₂ and to modify Γ to achieve these goals, the following observations are made:

Observation 1: Consider any codeword c_(j)=(c_(j0), c_(j1), . . . , c_(jn−1)) of the OBC that places its t^(th) coded bit, c_(jt), at the i^(th) position of u at the input of the IRCC, i.e., u(i)=c_(jt). Identify the corresponding v(i) (output of the IRCC) for the corresponding u(i)=c_(jt). For this c_(jt), identify each sequence i_(P) listed in P(d≧d_(t)) that contains the corresponding position i. Note that the position i can be listed in multiple sequences contained in the tables P(d≧d_(t)). Next calculate the SED, denoted as SED(i_(P)), for each identified sequence i_(P) that contains position i. Using all the identified sequences i_(P) that contain the position i, find the minimum of all the SED(i_(P)) values. Denote the minimum SED(i_(P)) value for position i, which corresponds to c_(jt), as D²(c_(jt)). Continue this process for all of the bit positions in the codeword c_(j) for t=0, . . . , n−1. Repeat the same process for all codewords c_(j), j=0, . . . , ρ−1. At this point the squared Euclidean distance of each coded bit position of each of the codewords c_(j) will have been computed for all coded sequences in the tables P(d≧d_(t)). Next find the minimum of D²(c_(jt)) among all t=0, . . . , n−1, for each codeword c_(j) as D²(c_(j))=Min_(t){D²(c_(jt))}. This calculation implies that the current CICM permutation Γ will cause each codeword c_(j) to have a MSED of D²(c_(j)).

At this point, it may be possible to choose the group of codewords c_(j) with higher D²(c_(j)) for the set CW₁ and the rest for CW₂, without making any changes to Γ. If this is not the case, modifications/adjustments can be made to Γ in order to increase the MSED separation between the sets CW₁ and CW₂.
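The bookkeeping of observation 1 can be sketched as follows. The data layout is hypothetical: `p_table` is a list of position tuples standing in for the sequences i_(P) of P(d≧d_(t)), `sed` holds the already computed SED(i_(P)) value of each sequence, and `codeword_positions` maps each codeword index j to the positions i occupied by its bits c_(jt). Treating a position that appears in no listed sequence as unconstrained (infinite distance) is also an assumption of this sketch.

```python
def per_bit_msed(codeword_positions, p_table, sed):
    """D^2(c_jt): minimum SED over the listed sequences that contain the bit's position."""
    seds_by_position = {}
    for idx, seq in enumerate(p_table):
        for pos in seq:
            seds_by_position.setdefault(pos, []).append(sed[idx])
    return {
        j: [min(seds_by_position[p]) if p in seds_by_position else float("inf")
            for p in positions]
        for j, positions in codeword_positions.items()
    }


def per_codeword_msed(bit_msed):
    """D^2(c_j) = Min_t{D^2(c_jt)} for every codeword j."""
    return {j: min(values) for j, values in bit_msed.items()}
```

Sorting the codewords by the resulting D²(c_(j)) values gives a first candidate split into CW₁ and CW₂ without any change to Γ.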

Observation 2: As stated in observation 1, D²(c_(jt)) is the minimum taken over all sequences in P(d≧d_(t)) that contain position v(i), and D²(c_(j)) is the minimum of D²(c_(jt)) over all t in each codeword c_(j). Hence, in order to increase D²(c_(j)), one needs to focus on the c_(jt,min) that determined D²(c_(j)), i.e., D²(c_(j))=D²(c_(jt,min)(1)). Note that D²(c_(jt,min)(1)) will have been determined by one or a few of the low weight coded sequences listed in P(d≧d_(t)). Further, if D²(c_(jt,min)(1)) can be increased up to the next lowest D²(c_(jt)) value among t=0, . . . , n−1, denoted by D²(c_(jt,min)(2)), then D²(c_(j)) will have been increased up to D²(c_(jt,min)(2)). In order to realize an increase in D²(c_(j)), it will be necessary to judiciously swap some positions in Γ, preferably with the smallest number of swaps. If possible, one can attempt to increase each D²(c_(j)) value gradually in n steps up to D²(c_(jt,max)) for all codewords in a set which would become CW₁, where D²(c_(jt,max)) is the maximum D²(c_(jt)) among all t=0, . . . , n−1. Each such increase will come about as a result of a modification/adjustment in Γ.

Observation 3: It was seen in observation 2 above that any D²(c_(jt)) can be adjusted by performing a sequence of swaps that cause the SED of selected corresponding coded sequences in P(d≧d_(t)) to be adjusted. Hence, next consider how to perform swaps to change the SED(i_(P)) value corresponding to any low weight coded sequence i_(P) listed in P(d≧d_(t)). Due to the assumed previous construction of Γ, all of the positions of i_(P) will already have been placed into Γ. With respect to the previously discussed QPSK example, some of the positions of i_(P) could have been placed on row 1 while the rest are on row 2. Denote the portion of i_(P) on row 1 by i_(P−1) and the portion of i_(P) on row 2 by i_(P−2). Hence, i_(P−1) represents the positions of i_(P) that can be swapped to lower the corresponding SED(i_(P)) value, while i_(P−2) represents the positions of i_(P) that can be swapped to increase the SED(i_(P)) value. If either of the i_(P−1) or i_(P−2) sets is empty, then the corresponding coded sequence i_(P) can only increase (if i_(P−1) is empty) or decrease (if i_(P−2) is empty) the SED(i_(P)) value. Further, for every coded sequence in the tables P(d≧d_(t)), one can also determine the maximum possible SED that coded sequence can achieve, SED_(max)(i_(P)), which will be realized if all of the positions in i_(P−2) can be moved to row 1.

Observation 4: Based on observations 1-3 above, the MSED of any OBC codeword c_(j) will be determined by the D²(c_(jt)) values of one or a few of its coded bits c_(jt). Further, each of the D²(c_(jt)) values will be determined by the SED(i_(P)) values associated with each coded sequence i_(P)∈{i_(P)(c_(jt))}={i_(P,cjt)(1), i_(P,cjt)(2), . . . , i_(P,cjt)(k_(jt))}, where k_(jt) is the total number of coded sequences that can influence D²(c_(jt)). Hence, it is seen that the MSED of codeword c_(j), D²(c_(j)), will be determined by the SED of a particular coded sequence, for example, i_(P,cjt)(l). The value of D²(c_(j)) can be increased by swapping one or more positions of i_(P,cjt−2)(l) with i_(P,cjt−1)(l). This will have the effect of swapping positions of the coded sequence i_(P,cjt)(l) that are currently placed on row 2 with positions not directly related to i_(P) that are currently placed on row 1. Before making such a swap, a check can be made to determine whether the movement of the position currently in row 1 to row 2 will violate any prescribed conditions. Further, the highest SED that codeword c_(j) can reach, D_(max)²(c_(j)), can also be found by assuming that the SED of the worst case coded sequences related to c_(j) can be increased up to its maximum D_(max)²(c_(j)) by successfully moving all of the associated positions from row 2 to row 1. That is, for every coded sequence i_(P) listed in P(d≧d_(t)), SED(i_(P)) could be increased up to SED_(max)(i_(P)), if all its positions on the second row can be successfully swapped.

Consider a set of n_(s) sequences in P(d≧d_(t)), i_(P,ns)={i_(P1), i_(P2), . . . , i_(Pns)}, for which SED_(max)(i_(Pi))≧D_(min,1)² for i=1, 2, . . . , n_(s). Note that some of these coded sequences may have an SED(i_(Pi)) value that is already at or above the threshold D_(min,1)², i.e., SED(i_(Pi))≧D_(min,1)². Because the CI will define a bidirectional permutation, π, between c_(jt) and its corresponding position i as per v(i), a reverse permutation (inverse constrained interleaving operation, π⁻¹), i.e., v(i)→c_(jt), can be defined which is referred to as “de-permuting” herein. Using this de-permuting process, next find the corresponding coded bit positions, c_(jt), of each and every position found in any of the low weight sequences in the set i_(P,ns). After that de-permuting process, if all the coded bits of a set of ρ₁ codewords can be found whose SED(i_(Pi)) values all satisfy SED(i_(Pi))≧D_(min,1)², then those ρ₁ codewords can be used to form a set like CW₁ to maintain a MSED of D_(min,1)².

Therefore, to identify a set of ρ₁ codewords of the OBC for a level of protection determined by D_(min,1)², first find the SED_(max)(i_(P)) values for every coded sequence i_(P) listed in the tables P(d≧d_(t)). Next form the set of sequences i_(P,ns) by considering the set of coded sequences i_(P) for which SED_(max)(i_(P))≧D_(min,1)². Next de-permute all the positions of all sequences in i_(P,ns). At this point, it is determined whether at least ρ₁ codeword positions satisfy D²(c_(j))=Min_(t){D²(c_(jt))}≧D_(min,1)². If not, this means no modifications/adjustments can be made to the current CICM permutation matrix Γ in order to cause ρ₁ codewords to achieve a MSED of D_(min,1)². In addition, for all sequences in i_(P,ns), let ch(i_(P)) denote the number of positions of the sequence i_(P−2) that need to be moved from the second row of Γ up to the first row in order to enforce SED(i_(P))≧D_(min,1)². This parameter can be calculated as ch(i_(P))=(D_(min,1)²−SED(i_(P)))/4a², because moving each position from row 2 to row 1 increases the SED by 4a² (i.e., from 4a² to 8a²).
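For the two-row QPSK example, the quantities SED(i_(P)), SED_(max)(i_(P)) and ch(i_(P)) can be computed as in the sketch below. The function names and the `row_of` mapping (position to row 1 or row 2 of Γ) are hypothetical. The sketch assumes the per-position contributions simply add, i.e., that the positions of a sequence fall in different QPSK symbols; rounding ch(i_(P)) up to an integer, clamping it at zero, and returning `None` when the target exceeds SED_(max)(i_(P)) are further conveniences added here.

```python
import math


def sed(seq, row_of, a2=1.0):
    """SED(i_P): 8a^2 for each position on row 1 and 4a^2 for each position on row 2."""
    return sum((8.0 if row_of[p] == 1 else 4.0) * a2 for p in seq)


def sed_max(seq, a2=1.0):
    """SED_max(i_P): the value reached if every position of the sequence sits on row 1."""
    return 8.0 * a2 * len(seq)


def moves_needed(seq, row_of, d2_min_1, a2=1.0):
    """ch(i_P) = (D_min,1^2 - SED(i_P)) / 4a^2, or None if the target is unreachable."""
    if sed_max(seq, a2) < d2_min_1:
        return None
    deficit = d2_min_1 - sed(seq, row_of, a2)
    return max(0, math.ceil(deficit / (4.0 * a2)))
```

For instance, a weight-4 sequence with one position on row 1 and three on row 2 has SED=20a² and SED_(max)=32a², so three moves are needed to reach a target of 32a² and two moves to reach 28a².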

Next consider methods 1-4 below that can be used to select the candidate subsets of codeword positions to be used to construct set CW₁ for unequal error protection. All of methods 1-4 below can also be used to construct the set CW₂ instead of CW₁. In such cases, methods 1-4 are modified by starting with the lowest SEDs instead of the highest SEDs. Depending on the code and the parameters, it is sometimes easier to construct CW₂ as opposed to CW₁. The methods are:

1. De-permute the positions of the coded sequences for which SED(i_(P))>D_(min,1) ². If at least ρ₁ codeword positions satisfy D²(c_(j))=Min_(t){D²(c_(jt))}≧D_(min,1) ², identify these ρ₁ codeword positions and stop. No additional work is needed. The current design of Γ for equal error protection can also be used for unequal error protection.

2. Identify the coded sequences i_(P) with the highest SED_(max)(i_(P)) values and de-permute all positions in these coded sequences. Then do the same for the sequences with the next highest SED_(max)(i_(P)) values. Continue the process by de-permuting sequences one by one, each time selecting the remaining sequence with the highest SED_(max)(i_(P)) value. Stop when ρ₁ such codeword positions have been identified. At that point, ρ₁ codeword positions for the set CW₁ will have been identified. Also identify the highest possible D_(min,1) ² value, which will be equal to the last SED_(max)(i_(P)) used to construct the set CW₁. This method can be used when the highest possible D_(min,1) ² is required.

3. De-permute the positions of the coded sequences for which SED(i_(P))>D_(min,1) ². If no set of ρ₁ codeword positions satisfies the condition of method 1, de-permute one coded sequence at a time, starting from the coded sequence with the lowest ch(i_(P)) and moving to the coded sequence with the next lowest ch(i_(P)). Continue this process until all coded bits of ρ₁ codeword positions are observed in the set of de-permuted coded bits. The purpose of this approach is to find the set CW₁ that lowers the number of swaps needed.

4. De-permute the positions of the coded sequences for which SED(i_(P))>D_(min,1) ². If no set of ρ₁ codeword positions satisfies the condition of method 1, find the codeword position c_(j) that satisfies SED(i_(P))>D_(min,1) ² and has the highest number of coded bit positions. Permute the remaining coded bits of that codeword onto v (i.e., find the corresponding v(i)'s). For each such v(i), find the sequence i_(P) that contains it and has the smallest ch(i_(P)) value. De-permute all of the positions, i, in that i_(P). Note that de-permuting i_(P) can fill in other coded bits of the remaining codewords too. Continue the process until ρ₁ codewords have been filled. This approach tries to identify codeword positions c_(j) that have most of their coded bit positions, c_(jt), satisfying SED(i_(P))>D_(min,1) ².

Using a candidate set of codeword positions identified in one of the methods 2-4, next consider how to identify swaps that are used to modify/adjust Γ so as to cause at least ρ₁ codeword positions to satisfy D²(c_(j))=Min_(t){D²(c_(jt))}≧D_(min,1) ². Note that when a position that is placed on the first row is moved down to the second row, all coded sequences i_(P) that include the coded bit position being swapped will lower their SEDs by 4a² (from 8a² to 4a²). Hence, to identify a position that can afford to tolerate that swap, look at all coded sequences in P(d≧d_(t)) that include the candidate position to be swapped and make sure that all such sequences can afford to lower their SED by 4a² and still maintain the required MSED values of CW₁ and CW₂ (D_(min,1) ² and D_(min,2) ² respectively). Therefore, to prepare a list of valid positions to swap:

1. Identify coded sequences in the tables P(d≧d_(t)) that can afford to lower their SED.

2. For each selected sequence, search through each of its positions on the first row to determine whether all other sequences that contain that position can afford to lower their SEDs as well. If so, add that position to the list of valid positions to swap.

3. Repeat steps 1 and 2 for each position of each coded sequence that can afford to lower its SED.
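The three steps above can be sketched, purely as an illustration, as a single pass over the first-row positions. The data layout (dictionaries keyed by hypothetical sequence names, a set of first-row positions) and the function name `valid_swap_positions` are assumptions of this sketch; the required SED of each sequence is taken as given.

```python
def valid_swap_positions(sequences, sed, required, row1_positions, a):
    """Return first-row positions that may be moved to row 2.

    sequences:      dict name -> set of positions in that coded sequence
    sed:            dict name -> current SED of the sequence
    required:       dict name -> SED the sequence must maintain
    row1_positions: positions currently placed on the first row of Gamma
    Assumption: a position contained in no listed sequence is treated as valid.
    """
    step = 4 * a * a  # SED lost by every sequence containing the moved position
    valid = []
    for pos in sorted(row1_positions):
        containing = [name for name, positions in sequences.items() if pos in positions]
        # Every sequence that includes this position must tolerate the loss.
        if all(sed[name] - step >= required[name] for name in containing):
            valid.append(pos)
    return valid

# Hypothetical example with a = 1.
sequences = {"p1": {0, 1, 2}, "p2": {2, 3}}
sed = {"p1": 32, "p2": 24}
required = {"p1": 24, "p2": 24}
swap_list = valid_swap_positions(sequences, sed, required, {0, 2, 3}, a=1)
```

Here position 0 is valid because only p1 contains it and p1 has margin to spare, while positions 2 and 3 are rejected because p2 is already at its required SED.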

However, it is also important to note that each coded sequence i_(P) can only afford to move up to a maximum number of positions from the first row to the second row of Γ. Based on the positions involved in the coded sequence, that coded sequence will need to maintain an SED of D_(min,1) ² or D_(min,2) ². This is because when positions of the sequence i_(P) are de-permuted, if all of the positions in i_(P) fall into CW₂, then i_(P) needs only to maintain an SED(i_(P)) of at least D_(min,2) ². However, if even a single de-permuted position falls into CW₁, then i_(P) needs to maintain an SED(i_(P)) of at least D_(min,1) ². Hence, if the current SED of the coded sequence is SED(i_(P)) and the sequence is required to maintain an SED of D_(min,1) ², then it can only afford to have npos(i_(P))=[SED(i_(P))−D_(min,1) ²]/4a² of its positions moved. Once CW₁ and CW₂ are identified, it is possible to find all npos(i_(P)) values for all coded sequences i_(P) in P(d≧d_(t)). Note that once npos(i_(P)) positions of a sequence i_(P) have been swapped, no more positions of that sequence can be swapped, and all remaining positions of that sequence should be discarded from the list of valid positions. Also note that the valid list of positions to swap is formed by positions from sequences i_(P) that de-permute to coded bit positions in CW₁ and/or CW₂ that have relatively high SED values. Further, when swapped with the positions from the list, it is seen that the D²(c_(j)) values that are lower will increase while those that are higher (which are likely to represent the list of valid positions) will start to decrease. Note that when D_(min,1) ² is higher, more swaps will likely be needed. That means a longer list of valid positions will be needed. The longer the list of valid positions is, the more likely that D_(min,2) ² will need to be lowered so as to create more possibilities for positions to be moved from row 1 to row 2 in CW₂.
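As a hedged illustration of the rule just described, the following sketch first selects which MSED a sequence must maintain and then computes npos(i_(P)). The helper names `required_msed` and `npos`, the representation of CW₁ as a set of codeword indices, and the clamping of negative results to zero are assumptions of this sketch.

```python
def required_msed(depermuted_codewords, cw1, d_min1_sq, d_min2_sq):
    """A sequence touching CW1 at all must keep D_min1^2; otherwise D_min2^2."""
    if any(codeword in cw1 for codeword in depermuted_codewords):
        return d_min1_sq
    return d_min2_sq

def npos(sed, required, a):
    """Sketch of npos(i_P) = [SED(i_P) - required] / (4 a^2).

    Assumption: a sequence already below its requirement can spare no positions.
    """
    step = 4 * a * a
    return max(0, int((sed - required) // step))

# Hypothetical example: codeword 7 belongs to CW1, so the stricter bound applies.
bound = required_msed([3, 7], cw1={7}, d_min1_sq=24, d_min2_sq=16)
spare = npos(sed=32, required=bound, a=1)
```

With these example values the sequence must keep an SED of 24 and can therefore give up (32−24)/4=2 positions.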

One other important point is that when D_(min,1) ² is higher than the D_(min) ² used to construct the initial Γ for equal protection (which is usually the case), d_(f) will also need to be adjusted according to D_(min,1) ². Each time d_(f) is increased, this will cause more sequences to be added to P(d≧d_(t)). All added sequences need to be considered while Γ is modified/adjusted to accommodate unequal error protection. Hence, a method to design Γ for unequal error protection can be outlined as follows:

1. Select the potential sets CW₁ and CW₂ using any of the above methods 1-4.

2. Expand P(d≧d_(t)) as needed by adding more coded sequences to it to match D_(min,1) ².

3. Prepare the list of valid positions for swapping as described above.

4. Start swapping positions. For each position selected to move from row 2 to row 1, find a partner from the list of valid positions for swapping. Note that whether the pair of positions comes from different columns or from the same column, the symbol Hamming distance will not be affected by that swap. It is desirable to swap one pair at a time, targeting an increase in the MSED of the set CW₁ while trying not to lower the MSED of CW₂. When swapping is done one pair at a time, the swaps are preferably selected to cause the SEDs of all codewords in each set to be more similar to each other and closer to the MSED of that set.

The quality of Γ:

Note that the design of Γ for both equal and unequal error protection is suboptimal in the sense that, for example, we cannot claim that the designed Γ achieves the highest possible D_(min) ² for equal error protection. Then how good is the designed Γ? After the above discussion, we can answer that question at least to some extent. If the SED(c_(j)) values of all codewords are very similar and close to D_(min) ², the design can be considered to be a good one. If not, and the SED(c_(j)) values vary a lot, the design likely has room for further improvement. That is, it is probable that the low SED(c_(j)) values could be increased by lowering the high SED(c_(j)) values. This can be done by swapping positions of the already constructed Γ. That is, the lower SED(c_(j)) values could be increased by using the same method described above: finding the list of swapping positions and then swapping positions. Hence, the unequal error protection design method described herein can be considered as a fine tuning process in the design of Γ. Interestingly, any initial design of Γ can be used as the starting point for the tuning process. However, it is desirable to start with a Γ design that satisfies the symbol Hamming distance condition. Then the fine tuning process can be used to achieve a high D_(min) ² value while simultaneously maintaining the symbol Hamming distance condition.
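One simple way to express the quality criterion above, offered only as an illustrative sketch, is to report the minimum SED(c_(j)), the spread between the highest and lowest values, and the codewords sitting at the minimum. The function name `sed_quality` and the use of max-minus-min as the spread measure are assumptions of this sketch; the specification does not prescribe a particular metric.

```python
def sed_quality(sed_by_codeword):
    """Summarize how evenly SED(c_j) is distributed over the codewords.

    Returns (D_min^2, spread, codewords at the minimum). A small spread
    suggests little room for further tuning; a large spread suggests that
    swaps could raise the low values at the expense of the high ones.
    """
    values = list(sed_by_codeword.values())
    d_min_sq = min(values)
    spread = max(values) - d_min_sq
    lowest = sorted(c for c, v in sed_by_codeword.items() if v == d_min_sq)
    return d_min_sq, spread, lowest

# Hypothetical SED values for three codewords.
summary = sed_quality({"c0": 24, "c1": 32, "c2": 24})
```

In this example the spread of 8 indicates that c1 may be able to give up distance so that c0 and c2 can gain it.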

Similarly, in the unequal error protection applications, a good Γ design should maintain similar SED(c_(j)) values all close to D_(min,1) ² for the set CW₁ and similar SED(c_(j)) values close to D_(min,2) ² for all c_(j) in CW₂. The above fine tuning process outlined for the design of two sets can be continued until similar SEDs in the two sets are reached.

It can be noted that other variations are possible. For example, if a constellation is being used that has four levels, it may be desirable to apply a strong code such as a CTBC code to only encode the first level or the first two levels, for example. Then weaker codes such as block codes could be used to encode the third and/or fourth levels. By doing so, we can use both the codes and the design of Γ to generate a bigger separation between the levels of protection than by using the same code and using only Γ to provide different levels of protection.

In the previous discussion, we considered how to design Γ to provide unequal error protection using an already-designed constrained interleaver of a given CTBC code. It is also possible to design the constrained interleaver used in the CTBC code from the outset in order to make the design of the corresponding Γ simpler. For example, the constrained interleaver used in the CTBC code can be specifically designed using one or more additional constraints that cause each of the sequences listed in the P(d≧d_(t)) tables to have all of its positions de-permute to either CW₁ or CW₂, but not both. If the constrained interleaver of the CTBC code is constrained in this way, the Γ design for unequal error protection as described above becomes much simpler.

Hence, the unequal error protection constraints used in the CTBC code's constrained interleaver design will ensure that no low Hamming weight sequences listed in the P(d≧d_(t)) tables are generated by combinations of codewords from CW₁ and CW₂ jointly. That is, combinations of codewords from only CW₁ and only CW₂ are allowed to generate the low weight sequences of v, but combinations from both CW₁ and CW₂ are not. With this additional constraint, the constrained interleaver will not allow any combination of codewords from CW₁ and CW₂ to generate sequences of v with weight less than d_(f). This ensures that every sequence listed in the P(d≧d_(t)) tables will have all of its positions de-permute to either only CW₁ or only CW₂.

One way to implement this additional interleaver constraint is to start by arbitrarily selecting ρ₁ codewords for CW₁. Then instead of placing one coded bit of every codeword (as described before in the CTBC code's constrained interleaver design), place all coded bits of the ρ₁ codewords in CW₁ into v to maintain the desired MHD d_(t). This can be done by placing one coded bit at a time of the codewords in CW₁ into v as described above in connection with the CI-3 and CI-4 constrained interleaver design methods, for example. Next place coded bits of CW₂ into v in such a way as to maintain the desired MHD (of preferably d_(t) or lower if necessary since the MHD of CW₂ will be lower than d_(t)). However, while placing coded bits of CW₂, ensure that any combination of codewords that involves codewords from both CW₁ and CW₂ ends up generating a high MHD (preferably at least d_(f) calculated according to D_(min,1) ²). If necessary, allow only as few sequences of v as possible with lower weights (lower than d_(f)) to arise from combinations of positions that de-permute into both CW₁ and CW₂. In such cases where it was not possible to completely separate CW₁ and CW₂, most of the sequences on P(d≧d_(t)) will be from either only CW₁ or only CW₂, with only a few from both CW₁ and CW₂. With such a P(d≧d_(t)) table, it becomes easier to design Γ using the approach as explained above for unequal error protection. If it was possible to completely separate CW₁ and CW₂, then it becomes much easier to design Γ.

Alternatively, starting from any CTBC code's constrained interleaver, positions can be swapped on u (and therefore on v also, in the corresponding locations) to try to move towards a situation where the sequences on P(d≧d_(t)) are completely separated in accordance with CW₁ and CW₂ as described immediately above. That is, for each low weight sequence on P(d≧d_(t)), swaps are performed to move positions in u so that a given low weight sequence either becomes a high weight sequence or becomes a low weight sequence whose positions come from only CW₁ or only CW₂.

CTBC codes that use CICM with unequal protection can thus be designed by designing both the CTBC code's constrained interleaver (u=π[c]) and Γ as discussed above. Also, using the same concepts, the CTBC code's constrained interleaver and the Γ constrained interleaver can be designed jointly. If the CTBC code's constrained interleaver cannot be designed to have all of the respective positions of each respective low weight sequence in the P(d≧d_(t)) tables de-permute to completely separated sets CW₁ and CW₂, then the separation that could not be carried out in the CTBC code's constrained interleaver can be carried out during the design of the Γ constrained interleaver. Similarly, if the Γ design becomes difficult with not enough positions to swap, then the CTBC code's constrained interleaver can be adjusted to help the Γ design. This way, the CTBC code's constrained interleaver and the Γ constrained interleaver can be designed and adjusted jointly. Swaps or other design steps can be carried out in one constrained interleaver design algorithm until a limiting condition is encountered. At that point, the joint design algorithm switches over to the other interleaver and performs adjustments there, until another limiting condition is encountered. Then the joint design algorithm switches back to the first constrained interleaver design algorithm, and so on, until both the CTBC code's constrained interleaver and the Γ constrained interleaver have been jointly designed/adjusted to meet all of the interleaver constraints. Once the interleaver constraints are met, the symbol Hamming distance and the multiple MHD requirements of the unequal error protection coding scheme will be satisfied.

It should be noted that any of the embodiments that use unequal error protection as described above can be used in accordance with FIG. 19 and FIG. 20 and the discussion thereof. The modifications of enforcing additional constraints can be applied to FIG. 19 and FIG. 20 to provide alternative embodiments of transmitters, receivers/decoders, and contention-free deterministic constrained interleavers, Γ_(DCI).

MIMO and Spatial Modulation:

A MIMO system employs n_(t)>1 transmitting antennas at the transmitter and n_(r)>1 receiving antennas at the receiver. A fading or stationary channel can be described by an n_(r) by n_(t) channel matrix H. The j^(th) column of H represents the channels from the j^(th) transmitting antenna to each of the receiving antennas. The channel matrix H can be transformed to show that the channel can be represented in terms of n_(min)=min(n_(t), n_(r)) independent data streams. MIMO modulation rules allow n_(min) constellation points to be transmitted simultaneously on the MIMO channel. Since each such data stream is capable of carrying m bits per interval using a signal constellation with M=2^(m) constellation points, the resulting MIMO system using the MIMO modulation rule is capable of transmitting mn_(min) bits per interval. Hence, a MIMO system can transmit mn_(t) bits per interval as long as n_(r)≧n_(t). Therefore, by increasing the number of antennas, it is possible to increase the throughput of the system by increasing the transmitted data rate of a MIMO system. The V-BLAST system developed by Bell Labs is such a system that can increase the data rate. In V-BLAST, the transmitted signal during any interval is a combination of symbols, s₁, s₂, . . . , s_(nt), which can be represented using a vector x=[s₁, s₂, . . . , s_(nt)], where each s_(j), j=1, 2, . . . , n_(t), is an independent symbol selected for transmission from the set of symbols {s}={s₁, s₂, . . . , s_(M)} used in the M=2^(m)-ary constellation. In matrix-vector notation, the received column vector y is formed by all received signals from all n_(r) antennas at the receiver during a frame interval. Hence, in the presence of a noise vector w, y can be expressed in matrix-vector notation as

y=Hx+w  (31)

where H corresponds to a MIMO channel matrix. Various forms of mathematical MIMO channel models are well known to those of skill in the art. For a more detailed discussion of the MIMO channel model, see G. R. Raleigh and J. M. Cioffi, “Spatio-temporal coding for wireless communication,” IEEE TR Comm. Vol. 46, No. 3, March 1998, pp. 357-366 (“the Raleigh reference”). In the Raleigh reference, the MIMO channel matrix is representative of a channel for sending an entire frame of information. This channel matrix type can be used to model transmit and receive filter pairs that effectively exist among and between the different transmit and receive antenna channels. In such channel models, the vector y corresponds to an entire frame of data. The goal at the detector is to form an estimate x̂ of x from y, thereby estimating each transmitted symbol s_(j) from antenna j. It is assumed that the channel state information (CSI) is available at the receiver, i.e., the receiver knows the channel matrix, H. Channel state information derived at the receiver is used to estimate the channel matrix, H.

In V-BLAST, channel coding is not applied across data streams. Instead, the data streams coming from each antenna can be viewed as being stacked up vertically in the time domain, which is what the V stands for in V-BLAST. As a result, V-BLAST can suffer under slow flat fading as some of the data streams can be severely faded. In order to overcome this drawback of V-BLAST, D-BLAST (diagonal BLAST) has been proposed. In D-BLAST, every coded block of a data stream is spread over all the antennas. D-BLAST also does not transmit from certain antennas at the beginning of each frame so that interference between coded blocks can be cancelled at the receiver more effectively. While not transmitting from certain antennas at the beginning of selected intervals lowers the throughput of D-BLAST compared with V-BLAST, it makes successive interference cancellation (SIC) in the receiver/decoder, as discussed in further detail below, easier and more efficient.

Spatial modulation (SM) (also called SM-MIMO) is a technique that uses the spatial domain to transmit information. As opposed to V-BLAST, which assumes all of the transmitting antennas are transmitting simultaneously, SM selects one antenna among the n_(t) available transmitting antennas for transmission during each symbol interval. SM uses log₂(n_(t)) bits from the data stream to select one antenna out of the n_(t) antennas to transmit an m-bit symbol during each symbol interval. Therefore, SM is able to transmit a total of [log₂(n_(t))+m] bits each symbol interval. That is, log₂(n_(t)) bits are transmitted in the spatial domain (antenna selection) while m bits are transmitted in the signal domain (symbol selection). Stated another way, SM transmits log₂(n_(t)) bits on the spatial constellation (which is the available set of antennas) while simultaneously transmitting m bits on the signal constellation (which is the available set of signaling points).
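As a minimal sketch of the bit split just described, the following fragment maps one group of [log₂(n_(t))+m] bits to an antenna index and a symbol index. The function names, the choice to take the spatial bits from the front of the group, and the natural binary indexing are assumptions of this sketch; the specification leaves the actual assignment to the design of Γ and the mapping policy.

```python
def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned integer, MSB first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def sm_map(group, n_t, m):
    """Split one SM group into (antenna index, symbol index).

    Assumptions: n_t is a power of two, and the first log2(n_t) bits of the
    group select the antenna while the remaining m bits select the symbol.
    """
    m_spatial = n_t.bit_length() - 1
    if n_t < 1 or (1 << m_spatial) != n_t:
        raise ValueError("n_t must be a power of two in this sketch")
    if len(group) != m_spatial + m:
        raise ValueError("group must contain log2(n_t) + m bits")
    return bits_to_int(group[:m_spatial]), bits_to_int(group[m_spatial:])

# Four antennas (2 spatial bits) and a 16-point constellation (4 signal bits).
antenna, symbol = sm_map([1, 0, 0, 1, 1, 0], n_t=4, m=4)
```

In this example the bits 10 select antenna index 2 and the bits 0110 select symbol index 6.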

The SM receiver is capable of identifying, for each symbol interval, both the transmitting antenna and the transmitted symbol. This is accomplished by observing and processing the received signal array y over an entire coding frame of length K. The receiver is able to determine the transmitting antenna because each transmitting antenna, as viewed by the full set of n_(r) receive antennas, has its own electronic signature that can be used to differentiate between antennas. The signature of a transmitting antenna as observed by the full set of n_(r) receiving antennas comes from the transmitted signal and the fading or other channel effects from the channel matrix, H, between the transmitting antenna and each receiving antenna. One significant advantage of SM over regular MIMO is that it uses only one RF signal during any interval. As a result, there is no interference between the different transmitting antennas in SM as there is in MIMO. This lack of inter-antenna interference and the lack of a need to compensate for it at the receiver is the reason that SM receivers are much simpler than MIMO receivers such as V-BLAST. SM thus trades off some spectral efficiency for a much simpler receiver design and much better energy efficiency in terms of battery life and the like once processing-related power consumption is taken into account. The SM receiver must detect which of the transmit antennas sent the symbol (estimation of a spatial constellation coordinate) and must detect the symbol that was transmitted from that antenna (estimation of one or more signal constellation coordinates). For example, the ML (maximum likelihood) detection of SM requires minimization of the squared Euclidean distance, i.e., min∥y−Hx̂∥², over the set of antennas, {c}={c₁, c₂, . . . , c_(nt)}, and the set of symbols {s}={s₁, s₂, . . . , s_(M)}, where x̂ is the ML estimate of x in equation (31). Hence, ML detection of SM requires only Mn_(t) Euclidean distance checks as opposed to the M^(n_(t)) needed in a full MIMO system, thereby greatly reducing receiver complexity.
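To illustrate the Mn_(t) search, the following sketch evaluates equation (31) for an SM transmission and performs an exhaustive ML search over antenna and symbol pairs. The channel matrix values, the unit-energy PSK constellation, the single-interval (rather than frame-level) detection, and all function names are assumptions chosen for this example only.

```python
import cmath
import math

def psk_constellation(M):
    """Unit-energy M-PSK points (natural indexing, assumed for illustration)."""
    return [cmath.exp(2j * math.pi * k / M) for k in range(M)]

def sm_received(H, antenna, symbol, w=None):
    """y = H x + w, where x is zero except for `symbol` at index `antenna`."""
    n_r = len(H)
    w = w if w is not None else [0j] * n_r
    return [H[i][antenna] * symbol + w[i] for i in range(n_r)]

def sm_ml_detect(y, H, symbols):
    """Exhaustive search over all (antenna, symbol) pairs: M * n_t checks."""
    n_t = len(H[0])
    best = None
    checks = 0
    for j in range(n_t):
        for k, s in enumerate(symbols):
            checks += 1
            dist = sum(abs(y_i - row[j] * s) ** 2 for y_i, row in zip(y, H))
            if best is None or dist < best[0]:
                best = (dist, j, k)
    return best[1], best[2], checks

# Hypothetical 2x4 channel (n_r = 2, n_t = 4) and QPSK (M = 4), noise-free.
H = [[1 + 0j, 0.5j, -0.3 + 0.2j, 0.8 - 0.1j],
     [0.2 - 0.7j, 1.1 + 0j, 0.4 + 0.4j, -0.6j]]
symbols = psk_constellation(4)
y = sm_received(H, antenna=2, symbol=symbols[1])
antenna_hat, symbol_hat, checks = sm_ml_detect(y, H, symbols)
```

With M=4 and n_(t)=4 the search visits 16 candidates, whereas a full MIMO search over the same antennas would visit 4⁴=256.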

Various types of SM systems are known to those of skill in the art. For example, space shift keying (SSK), space time shift keying (STSK), and generalized spatial modulation (GSM) are all reported in the literature. In SSK, only one signal is transmitted, making m=0. That is, the information in SSK is conveyed entirely by the selection of the antenna. In STSK, the role of the selection of antenna is generalized to a role of selecting one of a pre-selected set of dispersion matrices that can have channel response effects that span multiple symbol intervals. When Q dispersion matrices are used, STSK can transmit log₂(Q) bits over the channel response duration of the dispersion matrices. In GSM, more than one antenna is selected for transmission during each symbol interval, thereby increasing the number of bits that can be transmitted over the spatial domain. Hence, GSM can be viewed as a combination of SM and MIMO. If n_(m)<n_(t) antennas out of all n_(t) antennas are selected for transmission, GSM is capable of transmitting up to

$\log_{2}\left( \begin{pmatrix}n_{t} \\n_{m}\end{pmatrix} \right)$

bits from the spatial constellation and n_(m)*m bits from the signal constellation in any given interval. However, as in MIMO, the signals transmitted from the different antennas interfere in GSM, and hence GSM requires a more complex receiver as compared to pure SM where only one transmit antenna is active at a time.
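The per-interval bit counts quoted above for V-BLAST style MIMO, SM, and GSM can be compared with a short sketch. The function names are hypothetical, n_(t) is assumed to be a power of two for SM, and the GSM spatial term is rounded down to a whole number of bits, which is an assumption of this sketch (the text states the value as an upper bound).

```python
import math

def floor_log2(n):
    """Largest integer k with 2**k <= n, for n >= 1."""
    return n.bit_length() - 1

def mimo_bits(m, n_t, n_r):
    """m * n_min bits per interval, with n_min = min(n_t, n_r)."""
    return m * min(n_t, n_r)

def sm_bits(m, n_t):
    """log2(n_t) spatial bits plus m signal bits (n_t a power of two)."""
    return floor_log2(n_t) + m

def gsm_bits(m, n_t, n_m):
    """floor(log2(C(n_t, n_m))) spatial bits plus n_m * m signal bits."""
    return floor_log2(math.comb(n_t, n_m)) + n_m * m

# Example: 16-ary signaling (m = 4) with four transmit and four receive antennas.
rates = (mimo_bits(4, 4, 4), sm_bits(4, 4), gsm_bits(4, 4, 2))
```

For this example the three schemes carry 16, 6, and 10 bits per interval respectively, which reflects the trade between spectral efficiency and receiver complexity discussed above.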

Consider the case of SM where only one transmit antenna is active at any given time. Assume that CSI can be estimated at the receiver and made available to the receiver signal processor. Besides ML detection, a simpler two-step detection process is known that first detects the transmitting antenna in the spatial constellation and then detects the signal transmitted in the signal constellation. Similarly, the above receiver structures can be extended for soft detection. For example, in ML soft decoding, all the (n_(t)M) Euclidean distances that correspond to different transmitted bit combinations can be used to calculate the L values (i.e., the log likelihood values, where, for the j^(th) bit, L(b_(j))=log(Pr(b_(j)=1)/Pr(b_(j)=0))) of each bit. This is done as in Pyndiah's soft decoding of block codes. For soft decoding of any bit position b_(j), j=1, 2, . . . , (m+log₂(n_(t))), we first identify the n_(t)M/2 distances (which can be called metrics) in favor of b_(j)=1 and the others in favor of b_(j)=0. Using these two groups, we find the L value of b_(j).
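The grouping of the n_(t)M metrics into those favoring b_(j)=1 and those favoring b_(j)=0 can be sketched as follows. This is a generic log-likelihood computation under assumed conditions (equally likely candidates, Gaussian noise with variance `noise_var`, and candidate index bits used directly as bit labels); it is not claimed to reproduce the exact procedure of the Pyndiah reference.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def bit_l_values(distances, n_bits, noise_var):
    """L(b_j) = log(Pr(b_j = 1) / Pr(b_j = 0)) from squared distances.

    distances[idx] is the squared Euclidean distance for the candidate whose
    n_bits-bit label is the binary expansion of idx (MSB is bit j = 0).
    """
    l_values = []
    for j in range(n_bits):
        shift = n_bits - 1 - j
        ones = [-d / noise_var for idx, d in enumerate(distances) if (idx >> shift) & 1]
        zeros = [-d / noise_var for idx, d in enumerate(distances) if not (idx >> shift) & 1]
        l_values.append(log_sum_exp(ones) - log_sum_exp(zeros))
    return l_values

# Hypothetical 2-bit example: candidate 00 is closest to the received signal.
l_vals = bit_l_values([0.0, 4.0, 4.0, 8.0], n_bits=2, noise_var=1.0)
```

Both L values come out negative in this example, indicating that each bit is more likely to be 0, consistent with candidate 00 having the smallest distance.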

Performance analysis of the ML detection of SM as described in the literature identifies three components that limit ML performance. These components are: (a) P_(signal), a probability of error component that depends on the signal domain and is similar to the contribution when transmitting from a single antenna; (b) P_(spatial), a probability of error component that depends on the spatial domain; and (c) P_(joint), a joint probability of error component that depends on both the signal and spatial domains. The lower the values of P_(signal), P_(spatial), and P_(joint), the lower the total probability of error and the higher the performance. This analysis suggests that the signal constellations that are used in normal communications may not be the best in SM. Instead, constellations where all the signal points have high relative amplitudes are preferred. The best such constellation is a PSK constellation where all signal points have the same amplitude. Intuitively, the above statement makes sense because if it is necessary to identify which transmitting antenna transmitted the signal, it is desirable for that signal to have as high an amplitude as possible. In the absence of noise, the received signal from the transmitting antenna will be high while that from all the other antennas will be zero. In addition to P_(spatial) and P_(joint), it is also necessary to reduce P_(signal). In order to reduce P_(signal), it is necessary to maintain a high Euclidean distance of the signal constellation. Up until now, a star constellation was shown to perform the best in SM systems as compared to other known constellations. However, with CICM, the present invention identifies that a CICM encoded PSK constellation is able to provide higher performance than the star constellation.

CICM MIMO and CICM SM:

Next consider designing an SM system that makes use of a CTBC code and maps the CTBC coded bits using a CICM mapping rule. The CICM interleaver rule, Γ, can be designed to maintain as high a Euclidean distance and a symbol Hamming distance as is possible. Hence, the design of Γ along with a reverse Gray coded constellation mapping onto symbols of the selected constellation ends up achieving as low a value for P_(signal) as is possible. That means the remaining aspects of the constellation should be selected by focusing on the other two contributions, P_(spatial) and P_(joint). However, because the CICM-mapped PSK constellations are all constant-envelope constellations, CICM-mapped PSK constellations are also optimal in terms of P_(spatial) and P_(joint). As described above, CICM-mapped QPSK, 8-PSK, and 16-PSK have been developed herein, and a general approach was provided to derive higher order CICM-mapped PSK constellations, e.g., 32-PSK, 64-PSK, and the like.

As stated before, SM transmits bits both on the signal constellation and on the spatial constellation. Specifically, during every interval, m coded bits are transmitted on the signal constellation while m_(spatial)=log₂(n_(t)) bits are transmitted on the spatial constellation. Hence, when designing Γ with CTBC codes for SM, it is first important to form groups of coded bits, each with m_(total)=(m+m_(spatial)) CTBC-encoded bits (or bits from whatever other underlying code is being used to encode the bit stream). A group of m_(total) bits is transmitted during every interval by feeding m bits from it to the signal constellation and the remaining m_(spatial) bits to the spatial constellation. The task of designing Γ in SM is (a) to best form K/m_(total) groups, each with m_(total) bits, from the K coded bits coming out of the CTBC code, and (b) to identify which m bits in a group should be fed to the signal constellation and which bits should be fed to the spatial constellation.

When dealing with only the signal constellation, Γ was designed to permute the coded bits onto symbols to maximize the symbol Hamming distance and the Euclidean distance on the signal constellation. Hence, when designing Γ in SM, it is important to first get an idea about the symbol Hamming distance measure and the Euclidean distance measure on the spatial constellation. Symbol Hamming distance is straightforward as it is equal to the minimum number of symbol intervals that the positions of any given coded sequence listed on P(d≧d_(t)) map into.

However, the Euclidean distance measure on the spatial constellation is not all that straightforward. In SM, during each symbol interval, one selected antenna will effectively transmit a signal constellation point. In order to roughly estimate a Euclidean distance type measure for the spatial constellation, consider that the squared distance separation between an antenna that transmits an energy E during a symbol interval and an antenna that stays idle is D²=E. Hence, the energy E of a signal point can be used as an approximate measure of the squared Euclidean distance in the spatial domain. However, the actual impact of the selected Euclidean distance in the spatial domain is also dependent on the channel matrix H (to include the fading model in wireless systems). When CTBC codes or other codes are used with the CICM-PSK constellations and mappings discussed above, this PSK modulation will maintain the highest possible energy E for all possible transmitted symbols. The approximate Euclidean distance E on the spatial constellation is also comparable with that in the CICM-PSK signal constellations. For example, the QPSK constellation shown in FIG. 1 has minimum D² equal to E=2a², even though the squared Euclidean distance between 00 and 01 (or 11 and 10) is 4E. Further, as explained earlier, the other higher-order PSK constellations shown in FIGS. 3 and 4 that were systematically constructed using the 4-ary PSK constellation in FIG. 1 have 4E as the highest distance associated with any one-bit change on the constellation and have 2E as the lowest distance associated with any one-bit change on the constellation. Hence, when CICM-PSK is used, the spatial domain squared Euclidean distance measure of E=2a² is a reasonable approximation during the design of Γ. In other embodiments, other approximations may alternatively be used.

Because SM treats the [m+log₂(n_(t))] bits transmitted during a symbol interval as mapping to a single SM symbol, the CICM mapping rule design algorithm discussed above can be directly applied to the design of Γ and to the design of the mapping policy. For example, assume that 16-PSK is to be used as the signal constellation using the reverse Gray coded mapping policy of FIG. 4. That is, in this example there are four bits assigned to the signal constellation portion of the SM constellation. Also assume in this simple example that there are four antennas, so that log₂(4)=2 bits are used in the spatial dimension of this same SM constellation. Following the discussion of the CICM-16-PSK example above, one way to design Γ for an SM-CICM using this 16-PSK signal constellation would be to form Γ so that its rows had the following energies [4E, 3.8478E, 3.4142E, 2E, E, E], where the first four rows correspond to the signal constellation and the last two rows correspond to the spatial constellation. With Γ formed in this way, the CICM mapping rule design algorithm as generally depicted in FIG. 18 can be used to design Γ.

However, another variation is to split the design of Γ into two parts: the design of Γ₁ for the signal constellation and the design of Γ₂ for the spatial constellation. Γ₁ can be designed as an array with m rows and K/m_(total) columns, while Γ₂ can be designed with m_(spatial) rows and the same number of K/m_(total) columns. The idea is to form groups of bits for transmission during each interval by combining columns of Γ₁ with columns of Γ₂. Every group is constructed by merging one column of Γ₁ with one column of Γ₂. Combining columns in this way merges Γ₁ and Γ₂ to form the final array Γ with m_(total)=m+m_(spatial) rows and K/m_(total) columns for transmission. The designs of Γ₁ and Γ₂ are similar to the design of Γ for a signal constellation previously discussed.

For example, if the signal constellation chosen is the 4-ary PSKconstellation shown in FIG. 1, the first row of Γ₁ is preferred as itcan contribute the highest squared Euclidean distance of 4E. However,since only part of the coded bits from the CTBC code can be placed inΓ₁, coded bits of many sequences listed on P(d≧d_(t)) can have theirpositions split into Γ₁ and Γ₂. Note that every position on every row ofΓ₂ can roughly contribute to the squared Euclidean distance by an amountE, while the first and second rows of Γ₁ can contribute a squaredEuclidean distance of [4E, 2E] respectively. Hence, by following thesame design steps of Γ, we can handle any splitting of coded bits of asequence into Γ₁ and Γ₂ to maintain the highest possible d_(s,t) andD_(min) ² values. In general Γ₁ can have different Euclidean distancecontributions from different rows depending on the selectedconstellation and its bit assignments to symbols. When a PSKconstellation is used, Γ₂ will usually have the same Euclidean distancecontribution from all rows, subject to variations in the channel matrix,H. Hence, in terms of the Euclidean distance, when placing a sequence ofP(d) in Γ₁ and/or Γ₂ with the larger 16-ary PSK example described above,the first preference should be given for the first row of Γ₁. Then thepreference goes to the second row and then the third row of Γ₁, and thelast row of Γ₁ and then any row of Γ₂. As before, the goal is tomaintain the highest d_(s,t) and the highest D_(min) ² while designingΓ₁ and Γ₂.

The same steps described above and as shown in FIG. 18 can be used to construct Γ₁ and Γ₂. However, the construction of Γ₁ and Γ₂ separately does not complete the task of constructing Γ. It is also required to merge Γ₁ and Γ₂ to form the final CICM interleaver matrix, Γ, while trying not to lower d_(s,t) and/or D_(min) ² during the merging. For example, if a sequence of P(d≧d_(t)) is placed on d_(s,t1) columns of Γ₁ and d_(s,t2) columns of Γ₂, the aim is to maintain a symbol Hamming distance of (d_(s,t1)+d_(s,t2)) for that sequence during transmission. This can happen only if none of the d_(s,t1) columns occupied by a given sequence in Γ₁ are grouped with any of the d_(s,t2) columns occupied by that sequence in Γ₂. Hence, when placing positions on Γ₁ and Γ₂, it is necessary to keep track of the columns of Γ₂ (or Γ₁) that should be avoided when merging with a column of Γ₁ (or Γ₂). This can be done by monitoring the set of columns, Γ_(c)(j), occupied by all sequences of P(d≧d_(t)) on Γ₂ (or Γ₁) that have a position placed in column j of Γ₁ (or Γ₂), j=1, 2, . . . , K/m_(total). Hence, when merging columns of Γ₁ with Γ₂, it is undesirable to merge column j of Γ₁ with any column of Γ₂ in Γ_(c)(j). In terms of merging columns of Γ₁ and Γ₂, it is desirable to keep the sets Γ_(c)(j) as small as possible, thereby making it easier to find a column from Γ₂ to merge with each column of Γ₁. In other words, in terms of merging, it is desirable to place sequences of P(d≧d_(t)) mostly either in Γ₁ or in Γ₂ without splitting them too much. However, this needs to be done while maintaining the highest possible d_(s,t) and D_(min) ². Since the signal constellation offers a higher Euclidean distance, one good strategy is to try to place shorter sequences of P(d≧d_(t)) mostly in Γ₁ and longer sequences of P(d≧d_(t)) (which can have more Euclidean distance contributions) mostly in Γ₂.
In light of the merging of columns of Γ₁ and Γ₂, it may be necessary to change the general steps of construction of Γ described above. Specifically, in the general steps 2 and 3, the most popular positions were placed along the same columns of Γ. Even though it still makes sense to follow the same approach in the design of Γ₁ and Γ₂, this may create problems later in the merging stage. Hence, if the CICM mapping rule design algorithm as described above fails in the merging stage because proper combinations of columns from Γ₁ and Γ₂ cannot be found, it is first desirable to go back to the designs of Γ₁ and Γ₂ and relax the condition that forces the most popular positions into the same columns, until valid combinations of columns can be found in the merging stage. Failing that, lower d_(s,t) and/or D_(min) ² and continue searching, performing roll-backs and swaps as necessary. Upon successful completion, the interleavers from Γ₁ and Γ₂ will have been merged in such a way as to form a final signal-spatial CICM interleaver rule, Γ, that can be used along with the signal mapping policy and the antenna selection policy to transmit a symbol from the signal constellation via a selected one of the n_(t) antennas.
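By way of a non-limiting illustration, the merging stage can be viewed as pairing each column j of Γ₁ with a distinct column of Γ₂ that lies outside Γ_(c)(j). The following Python sketch assumes the sets Γ_(c)(j) have already been recorded during placement, and it uses a standard augmenting-path bipartite matching. That matching procedure is an illustrative substitute and is not the roll-back and swap search described above; the name merge_columns and its arguments are hypothetical.

```python
def merge_columns(num_cols, conflicts):
    """Pair each column j of Gamma_1 with a distinct column of Gamma_2.

    conflicts[j] is the set Gamma_c(j) of Gamma_2 columns that must not be
    merged with column j of Gamma_1 (columns are indexed from 0 here).
    Returns a list `pair` with pair[j] = Gamma_2 column merged with
    Gamma_1 column j, or None when no conflict-free merging exists.
    """
    # owner[c] is the Gamma_1 column currently assigned to Gamma_2 column c
    owner = [None] * num_cols

    def try_assign(j, visited):
        # Look for a free Gamma_2 column for j, re-homing earlier
        # assignments along an augmenting path when needed.
        for c in range(num_cols):
            if c in conflicts.get(j, set()) or c in visited:
                continue
            visited.add(c)
            if owner[c] is None or try_assign(owner[c], visited):
                owner[c] = j
                return True
        return False

    for j in range(num_cols):
        if not try_assign(j, set()):
            return None  # caller would relax constraints and retry
    pair = [None] * num_cols
    for c, j in enumerate(owner):
        pair[j] = c
    return pair
```

A return value of None corresponds to the failure case discussed above, in which the designs of Γ₁ and Γ₂ are revisited before merging is attempted again.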

CICM-MIMO-SM Multi-Antenna Embodiment:

Referring now to FIG. 21, a transmitter, a channel and a receiver for CICM based MIMO spatial modulation (SM) communication are illustrated in block diagram form. The transmitter and receiver depicted in FIG. 21 can be embodied as a method, an apparatus, a device, or a system. A constellation and spatial mapper Γ 2105 is provided to receive a bit stream, which is presented to the block 2105 on the input arrow to the left. The bit stream can be a CTBC encoded bit stream, for example. As discussed earlier in connection with CICM, the input bit stream can be any coded bit stream for which a set of tables P(d), for d=d_(t), d_(t)+1, . . . , d_(f), can be constructed. This would include block codes, convolutional codes, turbo product codes, and other codes like selected turbo codes and LDPC codes for which these tables can be constructed.

The output of the constellation and spatial mapper Γ 2105 is typically sent to a set of radio frequency circuits which include modulator circuits, transmitter amplifiers, and n_(t) different transmit antennas. These transmit antennas are represented as the triangles to the right side of the block 2105. In a typical SM-CICM embodiment only one transmit antenna is active at any given time. The input coded bit stream is separated into groups of [m+log₂(n_(t))] bits, and during each symbol interval, m of those bits are used to select a signal constellation point and the remaining log₂(n_(t)) bits are used to select a transmit antenna. In the example of CICM-16-PSK and four transmit antennas, the matrix Γ will have [m+log₂(n_(t))] rows and K/(m+log₂(n_(t))) columns, that is, six rows and K/6 columns.
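By way of a non-limiting illustration, the separation of the coded bit stream into signal bits and antenna-selection bits can be sketched in Python as follows. The sketch assumes that the bits have already been read out of Γ one column at a time, that n_(t) is a power of two, and that each bit group is converted to an index by natural binary weighting. That weighting is a placeholder only, since the actual signal mapping policy (for example the mapping of FIG. 4) and the antenna selection policy are defined elsewhere. The name split_sm_symbols is hypothetical.

```python
def split_sm_symbols(bits, m, n_t):
    """Split a coded bit list into (constellation_index, antenna_index) pairs.

    Each group holds m + log2(n_t) bits: the first m bits select the signal
    constellation point and the remaining bits select the transmit antenna.
    """
    b_spatial = n_t.bit_length() - 1          # log2(n_t) for a power of two
    if n_t < 1 or (1 << b_spatial) != n_t:
        raise ValueError("n_t must be a power of two")
    group = m + b_spatial
    if len(bits) % group:
        raise ValueError("bit count must be a multiple of m + log2(n_t)")

    def to_index(chunk):
        # Natural binary weighting (illustrative placeholder mapping)
        value = 0
        for b in chunk:
            value = (value << 1) | b
        return value

    pairs = []
    for start in range(0, len(bits), group):
        chunk = bits[start:start + group]
        pairs.append((to_index(chunk[:m]), to_index(chunk[m:])))
    return pairs
```

For 16-PSK (m=4) and four antennas, each group holds six bits, so a frame of K coded bits yields K/6 SM symbols.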

The ultimate output of the block 2105 is an SM transmission signal that is sent out over the n_(t) different antennas as a function of time and the input coded bit stream. This multi-antenna output is then processed via a channel matrix, H. The channel matrix H is actually a mathematical representation of a combination of transmit and receive signal processing in addition to a stationary or time-varying fading channel. All of these channel effects are termed the channel matrix, H, herein. In practice the channel matrix, H, is embodied as a physical multiple input-multiple output transmission channel. The output of the channel matrix, H, is coupled to a multi-antenna receiver front end 2115, which performs front end receiver processing and baseband processing in order to provide a detection signal. In practice the detection signal is digitized on the I and Q channels and is then processed to form a set of bit metrics that are to be used in conjunction with a SISO decoder 2125.

The output of the multi-antenna receiver front end 2115, typically in the form of computed bit metrics, is passed via a CICM deinterleaver block 2120 to the SISO decoder 2125. As the SISO decoder 2125 performs SISO iterations, extrinsic information will be updated. When every SISO iteration completes, new updated bit metrics will be needed. Hence the SISO decoder sends a subset of its extrinsic information via a CICM interleaver 2130 back to the multi-antenna receiver front end 2115 (a memory structure therein that holds information associated with the received digitized I/Q signal points derived from the multi-antenna receiver front end 2115). The updated bit metrics are derived using the information associated with the received signal points from the multi-antenna receiver front end 2115 and the available extrinsic information. The updated bit metrics are sent via the CICM deinterleaver 2120 back to the SISO decoder 2125. SISO iterations continue in this way until the coded bit stream has been decoded and the original information bits become available. The output of the SISO decoder is then provided on the output arrow to the right of the SISO decoder block 2125.

It can be noted that the system 2100 has many key advantages. First of all, in embodiments that use CICM-PSK, the constellation is constant envelope and is able to accommodate multiple bits while maintaining a high MSED at the coded sequence or codeword level. Secondly, the constellation and spatial mapper Γ 2105 will maintain as high a symbol Hamming distance as is possible. Thirdly, because the PSK constellation is constant envelope, all signal points will have the same amplitude, and this will cause the performance of the spatial constellation aspects of the SM constellation to be maximized. If CTBC coding is used, the SISO decoder will reap all of the benefits discussed above in connection with CTBC encoding and decoding. Also, as in SM systems, because only one transmit antenna is active at any given time, the receiver complexity is greatly reduced relative to traditional MIMO systems.

It should be noted that aspects of the present invention can alternatively be used with MIMO modulation rules that are used to transmit a plurality of different constellation points through a plurality of different spatial channels simultaneously. For example, the system 2100 can be embodied to use GSM and MIMO systems such as V-BLAST, D-BLAST and the like. During each symbol interval, from two up to n_(t) CICM-mapped symbols (such as CICM-mapped 16-PSK symbols) can be transmitted via n_(t) separate antennas. In such systems, the class of CICM-PSK type constellations is considered to be optimal because all of the signal points have the same highest energy value.

In such systems, to maintain a system-wide symbol Hamming distance, the CICM mapping rule design algorithm can be designed to avoid allowing more than one antenna to transmit a bit from a given low weight sequence during a given symbol interval. In such cases, a single CICM permutation matrix, Γ, is designed with between two and m*n_(t) number of columns, depending on the number of signal points that will be simultaneously transmitted in a single symbol interval. In full V-BLAST/D-BLAST type embodiments there will be n_(t) number of columns that correspond to each Euclidean distance. The symbol Hamming distance can be computed as being effective only within each one of the n_(t) separate channels, or can be considered per symbol interval. In such embodiments, a single CICM permutation matrix, Γ, can be designed that has m rows and K/m columns as in the single-channel case. Now, however, a set of up to n_(t) columns of Γ will be mapped to separate antenna channels during each symbol interval.

For use in MIMO modulation embodiments of the system 2100 where more than one antenna is transmitting at the same time, e.g., STSK, GSM, V-BLAST and D-BLAST systems, the SISO decoder/interference canceller 2125 is preferably implemented to detect the multiple symbols that were transmitted from different antennas at the same time. In such embodiments, the SISO decoder/interference canceller block 2125 can be configured to perform ML estimation or can be augmented to additionally perform interference cancellation type functions as described below.

The block 2125 can be configured to perform optimal maximum likelihood (ML) detection. ML detection detects all streams jointly by searching over all possible x vectors to determine the best estimate of x, x̂, that minimizes ∥y−Hx̂∥², which is equivalent to minimizing the Euclidean distance. Hence, the ML decision rule is to find the vector, x̂, that solves

min∥y−Hx̂∥².  (32)

Since the above minimization requires checking M^(n_(t)) Euclidean distances, even in the uncoded case, the complexity of ML detection increases rapidly as the number of antennas increases.
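By way of a non-limiting illustration, the exhaustive search implied by the ML decision rule can be sketched in Python as follows. The sketch assumes an uncoded transmission, a known channel matrix H given as a list of rows, and a constellation given as a list of complex points; the name ml_detect is hypothetical.

```python
from itertools import product

def ml_detect(y, H, constellation):
    """Exhaustive ML detection: return the candidate vector x that
    minimizes ||y - H x||^2, together with the minimum metric."""
    n_r, n_t = len(H), len(H[0])
    best_x, best_metric = None, float("inf")
    # One candidate per combination of points: M ** n_t metrics in total
    for x in product(constellation, repeat=n_t):
        metric = 0.0
        for r in range(n_r):
            resid = y[r] - sum(H[r][t] * x[t] for t in range(n_t))
            metric += abs(resid) ** 2
        if metric < best_metric:
            best_x, best_metric = x, metric
    return best_x, best_metric
```

The loop over product(constellation, repeat=n_t) makes the M^(n_(t)) growth explicit: with QPSK, two antennas require 16 metric evaluations while four antennas require 256.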

The block 2125 can also be configured to function as a linear decorrelator detector. This technique detects streams individually. From equation (31) it can be seen that every element of y (the signal received by any single receive antenna) has contributions from every signal transmitted from every transmit antenna. Hence, when detecting s_(j) from antenna j, it is necessary to remove the interference on y caused by the signals transmitted from all other antennas. The removal of interference is done by decorrelation, and the decorrelation can be done by using a transformation on y. Specifically, when detecting s_(j), the decorrelator maps y onto a space that is orthogonal to h₁, h₂, . . . , h_(j−1), h_(j+1), . . . , h_(n_(t)), to form a new signal y′. As a result, the mapped signal y′ does not have any interference from signals transmitted from any antenna other than the desired signal transmitted from the j^(th) antenna. The mapped signal y′ is then passed through a matched filter to detect s_(j). The combination of the mapper that maps y to y′ and the following matched filter is the decorrelator. The decorrelator detector consists of a bank of n_(t) decorrelators, with one for each antenna j, j=1, 2, . . . , n_(t).
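By way of a non-limiting illustration, a single decorrelator of the bank can be sketched in Python as follows. The sketch assumes that the columns of H are the vectors h₁ through h_(n_(t)) and that H is known at the receiver. The orthogonal mapping is built with Gram-Schmidt orthogonalization, which is one of several equivalent ways to form it, and the projection and matched filter are folded into a single inner product. The names inner and decorrelate are hypothetical.

```python
def inner(u, v):
    """Hermitian inner product <u, v> = sum(conj(u_i) * v_i)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def decorrelate(y, H, j):
    """Estimate s_j after removing every other antenna's contribution."""
    n_r, n_t = len(H), len(H[0])
    cols = [[complex(H[r][t]) for r in range(n_r)] for t in range(n_t)]

    # Orthonormal basis for the span of the interfering columns h_t, t != j
    basis = []
    for t in range(n_t):
        if t == j:
            continue
        v = cols[t][:]
        for q in basis:
            c = inner(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = abs(inner(v, v)) ** 0.5
        if norm > 1e-12:
            basis.append([vi / norm for vi in v])

    # w is the part of h_j orthogonal to all interfering columns
    w = cols[j][:]
    for q in basis:
        c = inner(q, w)
        w = [wi - c * qi for wi, qi in zip(w, q)]
    gain = inner(w, cols[j])
    if abs(gain) < 1e-12:
        raise ValueError("h_j lies in the span of the interfering columns")

    # <w, y> is free of interference; dividing by <w, h_j> normalizes it
    return inner(w, [complex(v) for v in y]) / gain
```

In the absence of noise the returned value equals s_(j) exactly; with noise, it would be passed to a slicer or used to form bit metrics.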

The block 2125 can also be configured to perform Successive Interference Cancellation (SIC) coupled with a decorrelator. In this method, previously decoded symbols are used to cancel the interference that they cause on the symbol that is currently being decoded. When decoding s₁ through s_(n_(t)) in a predetermined order, decoding of s_(j) can be assisted by removing the interference caused by s₁ through s_(j−1) on y. This is done before the mapping of y to y′, thereby making the mapping process easier.

The block 2125 can also be configured to act as a minimum mean squared error (MMSE) receiver. The above described decorrelator performs well at high SNR when the interference is dominant, but it does not perform well at low SNR when the noise is dominant. Hence, in order to perform well at all SNR values, each decorrelator in the receiver can be replaced by an MMSE receiver. The MMSE receiver can be constructed as a transformation of y using the MIMO channel matrix, H. The block 2125 can also be configured to act as an MMSE receiver combined with SIC. This is very similar to the SIC described above, with the only difference being that each decorrelator is replaced by the MMSE receiver discussed above. In addition, the block 2125 can also be configured to perform other MIMO detection algorithms such as sphere detection (SD). SD is a simplification of ML detection that limits the search to a sphere around the received vector y. Other techniques that could be implemented in the block 2115 include a developed matched filter (DMF) as is known in the art and signal vector based decoding (SVD) as is also known to those of skill in the art.

SIC plays an important role in the detection of the individual datastreams associated with each component of the vectors x and y. However,it is known that SIC can introduce error propagation by passingincorrectly decoded symbols of different antennas for the detection ofthe signals on successive antennas. An aspect of the present inventionis based on the observation that the type of error propagation thatoccurs in V-BLAST and D-BLAST in MIMO transmission systems isstructurally similar to the error propagation that occurs in themulti-stage decoding (MSD) of multi-level codes (MLCs). U.S. Pat. No.8,532,229, “Hard iterative decoder for multilevel codes” to E. M.Dowling and J. P. Fonseka (“the Dowling reference”) describes a harditerative decoding (IHID) technique that improves upon MSD decoding.U.S. Pat. No. 8,532,229 is incorporated by reference herein to providethe full details of how to implement the IHID algorithm in the MIMOreceiver algorithms and system described below which can be implementedin the block 2125 of the system 2100.

An aspect of the present invention is to first follow the steps of SIC (such as an MMSE based approach that uses SIC as described above). Start by using the IHID algorithm to decode at least a subset of the signals from antenna 1 and, using the symbol decisions from antenna 1, remove/cancel the interference caused by the signal on antenna 1 while decoding the signal on antenna 2. Next, use the IHID algorithm to decode the signal on antenna 2 and use the symbol decisions from antennas 1 and 2 to remove/cancel the interference caused by antennas 1 and 2 while decoding the signal on antenna 3. This process can be continued until the signal on the last antenna n_(t) is decoded, in each case removing the interference from the previously decoded signals on antennas 1 through (i−1) while decoding the signal on antenna i. In addition, as in IHID, once the normal SIC steps are complete, loop back to antenna 1 with all the currently known information about decoded signals and repeat the process. In the second pass through the loop, any or all of the n_(t)−1 decoded symbols from the previous pass through the loop can be used to remove interference from an antenna's signal stream that is currently being decoded. That is, in the IHID based SIC approach, hard decisions of the decoded symbols on a given antenna can be used to remove the interference caused by those symbols on the remaining antennas. If the transmitter staggered the transmission of the signals on the different antennas during a start up phase similar to D-BLAST, the earlier iterations can process fewer symbols than the later iterations. This IHID based interference cancellation and decoding algorithm is continued by repeating the SIC steps in an iterative manner. This process can be stopped as soon as no change is seen in the decoded sequences on all antennas.

Soft interference cancellation with SISO decoding works because after an initial number of SISO iterations, the correct decoded sequence will begin to emerge. At that point, using the received signal and the regenerated signal based on the currently decoded message, the interference and the level of interference can be estimated. This estimated interference can be used to cancel out the interference for the next iteration. In general, interference can come from other sources such as ISI, IQ imbalance, and polarization interference in optics. Also, soft interference cancellation can be used to estimate the carrier phase in a non-coherent system for use during joint soft decoding and carrier phase tracking. In soft interference cancellation with soft decoding, the interference and the level of interference are estimated and updated and used to perform interference cancellation for use in each SISO iteration.

For use with CTBC codes or other codes that are soft decoded, the present invention contemplates methods, apparatus and systems for soft interference cancellation to be used with soft iterative decoding. Consider an example where a CTBC code is constructed using either CI-3 or CI-4 and is then mapped onto symbols using a CICM interleaver rule, Γ. In this example, there will be K/m number of 2^(m)-ary symbols ready for transmission. In accordance with CICM based MIMO transmission of the present invention, split these K/m symbols into segments of K/(mn_(t)) symbols each and feed those segments one by one to each antenna. These segments can be formed by simply dividing the symbol sequence in an orderly manner starting from the first symbol. With this construction, there will be a set of data streams available, placed vertically one below the other in time, as in V-BLAST. These segments can then be simultaneously transmitted from the respective antennas. In effect all K bits of the CTBC coded frame are transmitted during K/(mn_(t)) intervals, achieving the same data rate as V-BLAST. However, unlike V-BLAST and similar to D-BLAST, the above scheme has coding across different data streams.
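By way of a non-limiting illustration, the orderly division of the symbol sequence among the antennas can be sketched in Python as follows. The sketch assumes that each antenna receives one contiguous segment and that the number of symbols is a multiple of n_(t), so that the frame occupies K/(mn_(t)) symbol intervals; the name split_for_antennas is hypothetical.

```python
def split_for_antennas(symbols, n_t):
    """Divide a symbol sequence into n_t consecutive equal-length segments,
    one per antenna, and list the vector sent in each symbol interval."""
    if len(symbols) % n_t:
        raise ValueError("symbol count must be a multiple of n_t")
    seg_len = len(symbols) // n_t
    # segments[a] is the data stream transmitted from antenna a
    segments = [symbols[a * seg_len:(a + 1) * seg_len] for a in range(n_t)]
    # per_interval[k] stacks the k-th symbol of every stream (the vector x)
    per_interval = [tuple(seg[k] for seg in segments) for k in range(seg_len)]
    return segments, per_interval
```

Because consecutive symbols of one coded frame are spread over all of the streams, the coding spans the different data streams as noted above.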

In the decoding, bit metrics related to coded bits of the CTBC codes need to be extracted from the received vector y. This can be done by modifying the last step of any of the above receivers to extract soft information or by using any other known soft detection method described in the literature for MIMO applications. Next, run the first SISO iteration on a frame of the CTBC code. At that point there will be log likelihood ratio values (L values) for each coded bit. These L values indicate the best estimates of the bit values (1 or 0) of the CTBC coded sequence along with the reliabilities (which are the probabilities of these bit value decisions). Using the L values of these coded bits, next identify the corresponding decoded symbols 1 through K/m and the probability that the decision on each of those symbols is correct. At this point, based on the current information, the algorithm has identified each of the most likely symbols transmitted from each antenna and their probabilities. In any normal iterative decoding process that involves higher-order symbols (m>1) and has no interference, these probabilities can be used to better estimate the bit metrics from the received signal. However, in MIMO systems, since inter-antenna interference is present, the present invention introduces an additional step to be used to remove the interference in a soft manner. This is accomplished using the estimated probabilities of the symbols before updating the bit metrics. In the above IHID based SIC approach, hard decisions of the decoded symbols on a given antenna were used to remove the interference caused by those symbols on the remaining antennas. In soft decoding, the probabilities of the decisions of the decoded symbols are also available. Therefore, the interference caused by these soft decoded symbols can be removed/cancelled in a soft manner by using the estimated probabilities of the symbols.

Specifically, in soft interference cancellation, the interference from every symbol is first calculated as in any of the SIC approaches described above, and is then multiplied by the probability of that symbol found from the L values to estimate the interference contribution from that symbol. If s_(m) is the symbol having the highest L value, then L=log(P(symbol=s_(m))/(1−P(symbol=s_(m)))), and therefore P(symbol=s_(m)) can be easily found from the L value. Next, after the soft interference cancellation operation, all interference contributions can be subtracted to update the bit metrics using the signal constellation. The updated bit metrics can then be used for the next iteration. Since the decisions made at the beginning of the iterations can be rather unreliable, the soft interference cancellation procedure can be started after a preselected initial number of iterations, n_(init). As the iterations proceed, they will typically converge to a solution, the reliability of most symbols will become high, and the soft interference cancellation will be similar to the above described SIC solutions.
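By way of a non-limiting illustration, the recovery of a symbol probability from its L value and the probability-weighted subtraction of interference can be sketched in Python as follows. The sketch assumes a known channel matrix H given as a list of rows, one current symbol decision and one L value per antenna for the interval in question, and a natural logarithm in the definition of L. The names prob_from_l and soft_cancel are hypothetical.

```python
import math

def prob_from_l(l_value):
    """Invert L = log(P / (1 - P)) to recover P (numerically stable form)."""
    if l_value >= 0:
        return 1.0 / (1.0 + math.exp(-l_value))
    e = math.exp(l_value)
    return e / (1.0 + e)

def soft_cancel(y, H, j, decisions, l_values):
    """Return y with the probability-weighted interference of every antenna
    other than j subtracted, for use when updating antenna j's bit metrics."""
    out = list(y)
    for t, (s, l_val) in enumerate(zip(decisions, l_values)):
        if t == j:
            continue
        p = prob_from_l(l_val)
        for r in range(len(H)):
            # Interference of symbol s from antenna t, scaled by its reliability
            out[r] -= p * H[r][t] * s
    return out
```

When every L value is large the weights approach one and the operation reduces to the hard SIC subtraction, which matches the convergence behavior described above.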

Joint soft interference cancellation and soft decoding initially performs one or more soft decoding iterations to estimate an initial set of interference parameters. In some embodiments, the initial interference estimate for use in a current frame of data can be based upon interference parameter estimates from the immediately preceding frame of data. Once the initial interference cancellation parameters are available, and from then forward, soft interference cancellation subtracts an estimate of the interference from the received signal to perform a current SISO iteration. The estimate of the interference will be based upon information from the previous SISO iteration.

Consider an example that involves the transmission of CTBC signals using the QPSK constellation in FIG. 15. In this example, assume that there is a non-zero I/Q imbalance. The received signal on the I and Q channels during any k^(th) interval can be written as

y_(I)(k)=α_(I)*a_(I)(k)+β_(Q)*a_(Q)(k)+n_(I)(k)  (33a)

and

y_(Q)(k)=α_(Q)*a_(Q)(k)+β_(I)*a_(I)(k)+n_(Q)(k)  (33b)

for k=1, 2, . . . , K/2, where (a_(I)(k), a_(Q)(k))=(±a, ±a) represents the transmitted symbol, α_(I) and α_(Q) are interference parameters that account for amplitude distortion of the I and Q signal components (diagonal components of a 2×2 rotation/distortion matrix), β_(I) and β_(Q) are interference parameters that account for the interference from the I channel to the Q channel and from the Q channel to the I channel respectively (off-diagonal components of the 2×2 rotation/distortion matrix), and n_(I)(k) and n_(Q)(k) are the I and Q channel noise components. Due to the 2×2 distortion matrix, even in the absence of noise, the received signal, (y_(I)(k), y_(Q)(k)), will not necessarily match the transmitted sequence, (a_(I)(k), a_(Q)(k)).

Initially, the SISO iterations can start off by assuming that α_(I)=α_(Q)=1 and β_(I)=β_(Q)=0. After the decoded sequence starts to emerge, estimates for a_(I)(k) and a_(Q)(k), for all k, become available. As the estimates for a_(I)(k) and a_(Q)(k) become more reliable, equations (33a) and (33b) can be used to estimate the α and β values. Upon estimating the α and β values, the y_(I)(k) and y_(Q)(k) values can be modified to cancel out the I/Q imbalance and to thereby form still more reliable estimates for (a_(I)(k), a_(Q)(k)). These improved estimates can then be used to calculate the bit metrics for the next iteration. If desired, the α and β values can be estimated every iteration or once every few iterations. Hence, in this example the joint soft interference cancellation and soft decoding forms initial estimates of the transmitted sequence using soft decoding, estimates the interference due to the I/Q imbalance, cancels the interference, and then continues to iteratively improve the reliability of both the interference estimates and the SISO decoded bit stream until convergence.
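By way of a non-limiting illustration, one way to estimate the α and β values from the current symbol decisions and then cancel the imbalance is sketched in Python below. The sketch assumes a least-squares fit of equations (33a) and (33b) over the frame, which is an illustrative choice since no particular estimator is prescribed above. It also assumes that the decided I and Q sequences are not identical up to sign, so that the 2×2 normal equations are solvable. The names estimate_iq_params and cancel_iq are hypothetical.

```python
def estimate_iq_params(y_i, y_q, a_i, a_q):
    """Least-squares estimates of (alpha_I, alpha_Q, beta_I, beta_Q) from
    received samples and the currently decided symbol components."""
    s_ii = sum(x * x for x in a_i)
    s_qq = sum(x * x for x in a_q)
    s_iq = sum(x * z for x, z in zip(a_i, a_q))
    det = s_ii * s_qq - s_iq * s_iq
    if abs(det) < 1e-12:
        raise ValueError("decided I and Q sequences are linearly dependent")

    def solve(y):
        # Returns (coefficient of a_I, coefficient of a_Q) for channel y
        b_i = sum(x * v for x, v in zip(a_i, y))
        b_q = sum(x * v for x, v in zip(a_q, y))
        return ((s_qq * b_i - s_iq * b_q) / det,
                (s_ii * b_q - s_iq * b_i) / det)

    alpha_i, beta_q = solve(y_i)   # (33a): y_I = alpha_I*a_I + beta_Q*a_Q
    beta_i, alpha_q = solve(y_q)   # (33b): y_Q = beta_I*a_I + alpha_Q*a_Q
    return alpha_i, alpha_q, beta_i, beta_q

def cancel_iq(y_i_k, y_q_k, alpha_i, alpha_q, beta_i, beta_q):
    """Invert the 2x2 distortion matrix for one interval to recover
    corrected (a_I, a_Q) values from (y_I, y_Q)."""
    det = alpha_i * alpha_q - beta_q * beta_i
    return ((alpha_q * y_i_k - beta_q * y_q_k) / det,
            (alpha_i * y_q_k - beta_i * y_i_k) / det)
```

In a receiver, estimate_iq_params would be called once per iteration or once every few iterations, and cancel_iq would be applied to every interval before the bit metrics are recomputed.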

In situations where some or many decoded symbols have low probabilities, only the intervals that have higher probabilities of the decoded symbols can be made to contribute significantly to the estimates formed from equations (33a) and (33b). In some embodiments the α and β parameters of the 2×2 distortion matrix of (33a) and (33b) preferably are calculated/updated based only upon a subset of decoded symbols whose reliabilities are above some threshold or relative measure.

The algorithm described above to cancel I/Q imbalance distortion can be viewed as a hard interference cancellation approach because the hard decoded symbols (a_(I)(k), a_(Q)(k)) are used in equations (33a) and (33b). A soft interference cancellation approach can be obtained by replacing a_(I)(k) and a_(Q)(k) in equations (33a) and (33b) by

$a_{I}^{\prime}(k) = \sum\limits_{i = 1}^{M} p(i,k)\,a_{I}(i)$ and $a_{Q}^{\prime}(k) = \sum\limits_{i = 1}^{M} p(i,k)\,a_{Q}(i),$

where p(i,k) represents the probability of symbol i during symbol interval k. This way, all M symbols are taken into account in each symbol interval in accordance with their respective probabilities. However, as the SISO iterations converge to a solution, these summations will converge to the contribution from only the correct symbol. This use of soft decoded data estimates with their probabilities is employed in many preferred embodiments.
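By way of a non-limiting illustration, the probability-weighted symbol components for a single interval can be computed as in the following Python sketch. The sketch assumes that the M constellation points are supplied as (I, Q) pairs and that the probabilities p(i,k) for the interval sum to one; the name soft_symbol is hypothetical.

```python
def soft_symbol(probs_k, points):
    """Return (a_I'(k), a_Q'(k)), the probability-weighted averages of the
    I and Q components of all M constellation points in interval k."""
    if len(probs_k) != len(points):
        raise ValueError("one probability is required per constellation point")
    if abs(sum(probs_k) - 1.0) > 1e-9:
        raise ValueError("symbol probabilities must sum to one")
    a_i = sum(p * pt[0] for p, pt in zip(probs_k, points))
    a_q = sum(p * pt[1] for p, pt in zip(probs_k, points))
    return a_i, a_q
```

For a QPSK constellation with a=1 and probabilities (0.7, 0.1, 0.1, 0.1) over the points (1,1), (1,−1), (−1,1), (−1,−1), the soft components are approximately (0.6, 0.6); as one probability approaches one, the result approaches the corresponding hard decision.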

Referring to FIG. 22, a joint interference cancellation and SISO decoding method, apparatus and system 2200 are illustrated in block diagram form. A sequence of received signal points, y, is received from a channel and is used to calculate an initial set of bit metrics in the block 2205. One or more initial SISO iterations are carried out (e.g., in blocks 2210, 2215, 2220, 2225) as discussed in further detail below, and the initial bit metrics can be updated each time one of these initial SISO iterations is completed. After the one or more initial SISO iterations have been computed, joint SISO iterations with soft interference cancellation are allowed to begin.

During joint SISO iterations with soft interference cancellation, a block 2210 performs soft decoding using a modified set of bit metrics that have been updated in a block 2255. These modified bit metrics are processed through a SISO half iteration involving the inner code in the block 2210. When CTBC codes are in use, the output of the block 2210 is then deinterleaved in accordance with a CI-2, CI-3 or CI-4 or any other constrained interleaver that implements a set of constraints as needed to support the underlying CTBC code. The deinterleaved sequence is then coupled to a block 2220 that performs soft decoding in accordance with the outer code. The soft decoded outputs of the block 2220 are then coupled to a block 2225 that performs interleaving, and the interleaved sequence is fed to a block 2230 that performs soft decoding in accordance with the inner code, and which may be a software instantiation that shares some or all of the same hardware as the block 2210. The inner decoded sequence is then passed to a block 2235, which computes symbol estimates along with their probabilities. The probabilities of all M symbols are calculated during each interval, and this calculation can be based on the likelihood (L) values of the coded bits generated during the SISO iterations.

Once the symbol estimates and their probabilities are known, hard type decisions (a_(I)(k), a_(Q)(k)) or soft type decisions (a_(I)′(k), a_(Q)′(k)) can be made similar to those discussed in connection with equations (33a) and (33b). In a block 2247 certain operations can be performed every iteration or every couple or few iterations, depending on the embodiment and signal conditions. In a block 2240 interference parameters are calculated. For example, depending on the embodiment, the interference parameters could be the components of a 2×2 rotation/distortion matrix as in the I/Q imbalance correction example of equations (33a) and (33b), or as in polarized channel type embodiments where channel imperfections cause the horizontal and vertical polarized channels to have a degree of cross talk. Other examples include V-BLAST, D-BLAST, GSM and other MIMO communications systems where there is more than one active transmitter at any given time. In such embodiments, the interference parameters computed in the block 2240 will be used to cancel interference due to other simultaneously transmitted channels (off-diagonal terms in an n_(r)×n_(r) inter-channel distortion matrix) from a selected channel (on-diagonal terms in the n_(r)×n_(r) inter-channel distortion matrix). If desired, only a subset of intervals that have higher probabilities for the decoded symbols can be used to estimate the interference parameters. Once the interference parameters are available, the block 2245 is used to compute a new interference-cancelled signal estimate vector, y′. The signal estimate vector, y′, is then stored in memory in a block 2250. The sequence y′ along with the estimated probabilities of the symbols is then used to compute a set of interference-cancelled bit metrics in the block 2255. The output of the block 2255 is then used in the next SISO iteration.

In many prior systems, interference cancellation is performed using interference cancellation parameters that have been estimated during previous frames. However, the method 2200 estimates the parameters based upon information from the frame being currently decoded. Even if there are slow time variations in the parameters inside of the current frame, the method 2200 will be able to track those slowly time-varying interference cancellation parameters. Since the number of estimated parameters is typically low, the method/system 2200 can be modified to calculate interference cancellation parameters for a variety of different types of interference that occur in the same frame being soft decoded. For example, the approach 2200 can be used in partially coherent or non-coherent systems to jointly perform SISO decoding and carrier phase recovery.

In many embodiments, the estimates that were made in the previous frame can also be used to provide a set of starting parameters to be used in the beginning of the current frame. That is, information from blocks 2240 and/or 2245 from the previous frame can be provided to block 2205 to start off the iterations in the current frame, v, using a vector y′ based upon the received signal information in the current frame and the interference cancellation parameters computed based on information from the previous frame.

A main benefit of the SISO decoder/soft interference cancellation is that the block 2125 of FIG. 21 can be implemented in an efficient manner. The underlying code (e.g., a CTBC code that has been CICM-mapped for transmission using n_(t) antennas that transmit in parallel) can be decoded using a SISO decoder that is augmented with a soft interference canceller. The soft interference canceller makes use of the same bit metrics and likelihood values that are used by the SISO decoder. The soft information used in the SISO decoder is related to the separate information streams that have been transmitted via n_(t) different antennas in the same symbol intervals. This soft information is also used by the soft interference canceller in the block 2125. The SISO decoder and the soft interference canceller together provide an integrated and seamless technique to converge to a MIMO solution using soft data.

The above soft interference cancellation approach 2200 of the block 2125 can also be applied to other forms of soft interference cancellation that do not involve multi-antenna MIMO systems. For example, the soft interference cancelling approach of the present invention can be applied in systems where there are other forms of MIMO processing and multiple data streams. Consider a specific example where a single antenna transmits on both the horizontal and vertical polarization. In such a case the channel can introduce a rotation so that the horizontal and vertical polarization channels interfere with one another. In such an example, the same above described soft interference cancellation technique could be used to cancel the interference between the horizontal and vertical polarizations as an integral part of soft decoding with soft interference cancellation. The method/apparatus/system 2200 can be applied to many other kinds of codes besides CTBC codes. This would include other types of serially concatenated codes, parallel concatenated codes such as turbo codes, convolutional codes, block codes, or generally any kind of code that is soft decoded using a SISO decoder. Hence it is to be understood that the soft interference cancellation technique described above could be applied to a variety of different communications systems where there is more than one data stream being sent simultaneously, there is cross talk between channels, and the channels are encoded in such a way that a SISO decoder is located in a receiver that is designed to decode the plurality of received signals.

CICM Spatial Modulation OTN Embodiment:

Referring now to FIG. 23, a transmitter, a channel and a receiver for CICM based optical spatial modulation (OSM) communication are illustrated in block diagram form. The transmitter and receiver depicted in FIG. 23 can be separately or jointly embodied as one or more methods, apparatus, devices, or systems. As is true generally with all block diagrams herein, in certain cases more than one block could be implemented on a single substrate or enclosure, and any one block could be implemented on more than one substrate or enclosure. A laser 2305 provides an optical carrier wave for optical communications. In many practical embodiments, where dense wavelength division multiplexing (DWDM) is used, the laser 2305 and the entire system 2300 would be repeated for every optical channel in the DWDM channel bank. Also, the entire system 2300 would be repeated for each of the horizontal and vertical polarizations in the fiber. For example, if the DWDM system had 80 channels, then the laser 2305 and the system 2300 would be repeated 160 times, that is, once for each of the horizontal and vertical polarizations at each of the 80 wavelengths. From here forward, the system 2300 will be described at a single wavelength and a single polarization, knowing that the following discussion of the system 2300 could be repeated at each wavelength and/or each polarization.

A constellation and spatial mapper $\Gamma = \begin{bmatrix}\Gamma_{1} \\ \Gamma_{2}\end{bmatrix}$ 2310 is provided to receive an input bit stream, which is presented to the block 2310 on the input arrow to the left. For example, the input bit stream can be a CTBC encoded bit stream. As discussed earlier in connection with CICM, the input bit stream can be any coded bit stream for which a set of tables P(d), for d=d_(t), d_(t)+1, . . . , d_(f), can be constructed. This would include, in addition to CTBC codes, block codes, convolutional codes, turbo product codes, and other codes like selected turbo codes and LDPC codes for which these tables can be constructed.

The output of the laser 2305 couples to a first input of an optical modulator 2315. The optical modulator receives at a second input m bits representative of a signal constellation point input that tells the optical modulator how to modulate the laser input to produce a modulated laser output. The modulation is performed in accordance with the signal constellation point supplied as a column of the submatrix Γ₁ by the constellation and spatial mapper 2310. For example, when m=4, each column of the submatrix Γ₁ could contain four coded bits that identify a 16-PSK signal point to which the four coded bits will be mapped in a given symbol interval. In this example a particular CTBC code is used to encode the bit stream input to the block 2310, and the encoded bit stream is then mapped to a sequence of constellation points using a CICM-16-PSK mapping as previously described.

The modulated laser output of the optical modulator 2315 is coupled to a single input multiple output (SIMO) active optical filter bank/combiner 2320. The SIMO active optical filter bank/combiner 2320 receives a spatial constellation point input that tells the SIMO active optical filter bank how to configure its SIMO transfer function so that the single input containing the output of the optical modulator 2315 is coupled through one of n_(t) internal optical signature filters that exist inside the SIMO active optical filter bank. The outputs of the n_(t) different internal optical signature filters within the SIMO active optical filter bank are sent to a combiner. The combiner can be implemented using known optical technology, including merging optical paths in an optical integrated circuit or fiber couplers that have multiple optical input fibers which are length matched and combined to form a single output.

The selection of which one of the n_(t) optical signature filters the modulated laser signal is coupled through is performed in accordance with the log₂(n_(t)) bits that correspond to a spatial constellation point supplied as a column of the submatrix Γ₂. For example, if there are sixteen possible optical signature filters inside of the SIMO active optical filter bank, then n_(t)=16 and the submatrix Γ₂ will have log₂(n_(t))=4 bits per column. Therefore, in each symbol interval, m=4 bits will be coupled from a selected column of the submatrix Γ₁ to the optical modulator to identify a 16-PSK signal point, and log₂(n_(t))=4 bits will be mapped from the same column of Γ₂ to identify the selected one of the n_(t)=16 optical signature filters through which the modulated laser signal will be coupled during that same symbol interval. Note in this example, where sixteen selectable optical signature filters exist within the SIMO active optical filter bank, that the line rate is doubled over what is sent by the CICM-16-PSK portion, because in addition to the four bits sent each symbol interval to select a 16-PSK constellation point, four more bits are sent each symbol interval to select a spatial constellation point (i.e., to select an optical signature filter through which to couple the 16-PSK modulated laser signal). The output of the SIMO active optical filter bank/combiner 2320 is output to an optical channel. The optical channel is typically implemented as a fiber optic communication channel, although free space laser communication channels could also be used.
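
The bit accounting in the above example (m=4 bits to Γ₁ plus log₂(n_(t))=4 bits to Γ₂ per symbol interval) can be sketched as follows. The natural-binary labeling of PSK points and filter indices is an assumption made only for illustration and is not the CICM mapping itself; the function names are hypothetical.

```python
import cmath


def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned integer, MSB first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def map_column(column_bits, m=4, n_t=16):
    """Split one column of Gamma into a PSK point and a filter index.

    The first m bits (a column of Gamma_1) select one of 2**m PSK points;
    the remaining log2(n_t) bits (the same column of Gamma_2) select one
    of the n_t optical signature filters. Natural-binary labeling is an
    ASSUMPTION for illustration, not the CICM mapping.
    """
    k = n_t.bit_length() - 1  # log2(n_t) when n_t is a power of two
    if (1 << k) != n_t:
        raise ValueError("n_t must be a power of two")
    if len(column_bits) != m + k:
        raise ValueError("expected m + log2(n_t) bits per symbol interval")
    psk_index = bits_to_int(column_bits[:m])
    filter_index = bits_to_int(column_bits[m:])
    point = cmath.exp(2j * cmath.pi * psk_index / (1 << m))
    return point, filter_index


def bits_per_symbol(m, n_t):
    """Total coded bits carried per symbol interval."""
    return m + (n_t.bit_length() - 1)


point, filter_index = map_column([0, 1, 0, 0, 1, 1, 1, 1])
```

With m=4 and n_(t)=16, `bits_per_symbol` evaluates to eight, twice the four bits carried by the 16-PSK portion alone, which is the doubling of the line rate noted above.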

To better understand the structure and function of the SIMO active optical filter bank/combiner 2320, let the output of the optical modulator 2315 be denoted as s(t), let the vector x=e_(i) be a standard unit basis vector of all zeros except for a “1” in the i^(th) component, and let H_(t) be a MIMO channel sub-matrix associated with a single symbol interval. Then the output of the SIMO active optical filter bank, y_(t)(t)∈C^(n_(t)), can be written as,

$y_{t}(t) = H_{t}\, x\, s(t)$  (34)

The combiner effectively creates a single output signal s_(t)(t) to couple to and through the optical channel. The signal s_(t)(t) is created by summing all of the elements of the vector signal y_(t)(t) at each point in time. Because the vector x is equal to a standard unit basis vector, e_(i), where the subscript, i, corresponds to a currently selected one of the n_(t) possible spatial constellation indices (identified by log₂(n_(t)) bits), the output of the combiner will be equal to the signal s(t) filtered by the i^(th) optical signature filter, i.e., convolved with that filter's impulse response. In this model, it can be noted that the H_(t) sub-matrix can be described as a matrix whose elements correspond to filter transfer functions. These transfer functions are optical transfer functions and are applied during each symbol interval. As the constellation and spatial mapper 2310 outputs each new pair of signal and spatial constellation points, a new coherently modulated laser signal s(t) is generated, the spatial constellation point is mapped to a selected index, i, for the corresponding symbol interval, and the output of the SIMO active optical filter bank/combiner 2320 corresponds to the i^(th) optical-signature-filtered and modulated laser signal, s_(t)(t).
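
A discrete-time sketch of equation (34) followed by the combiner is given below. It assumes, purely for illustration, that each optical signature filter is modeled as an FIR filter with the same number of taps; the tap values and function names are hypothetical.

```python
def fir(h, s):
    """Full linear convolution of FIR taps h with the sample sequence s."""
    out = [0j] * (len(s) + len(h) - 1)
    for n, s_n in enumerate(s):
        for k, h_k in enumerate(h):
            out[n + k] += h_k * s_n
    return out


def filter_bank_and_combiner(filters, i, s):
    """Model y_t = H_t x s with x = e_i, then sum the branches.

    filters: list of n_t FIR tap lists of equal length (an assumed model
             of the optical signature filters).
    i:       index selected by the spatial constellation point.
    s:       samples of the modulated signal s(t).
    Returns (branches, combined), where branches models y_t(t) and
    combined models the combiner output s_t(t).
    """
    x = [1 if j == i else 0 for j in range(len(filters))]  # x = e_i
    branches = [[x[j] * v for v in fir(h, s)] for j, h in enumerate(filters)]
    combined = [sum(samples) for samples in zip(*branches)]
    return branches, combined


# Hypothetical two-filter bank and a short modulated sequence.
bank = [[1, 1], [1, -1]]
s = [1 + 1j, -1 + 1j]
branches, s_t = filter_bank_and_combiner(bank, 1, s)
```

Because only the i^(th) entry of x is non-zero, every branch other than branch i is identically zero, and the combined output equals `fir(bank[i], s)`.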

Each of the active optical signature filters can be implemented in accordance with known technology, such as by using optical integrated circuit technology or fiber gratings and the like. For further details of the technology used to design and implement active optical filters, see U.S. Pat. No. 6,687,461: “Active optical lattice filters,” D. L. MacFarlane and E. M. Dowling, and U.S. Pat. No. 7,042,657: “Filter for selectively processing optical and other signals,” D. L. MacFarlane, both of which are incorporated by reference herein. In addition to the active components, which can include semiconductor optical amplifier regions (SOARs), the active optical signature filters can be designed to include passive optical filter sections. To see a number of optical filter architectures that are known to those of skill in the art and can be used inside the SIMO active optical filter bank, also see C. K. Madsen and J. H. Zhao, “Optical filter design and analysis: a signal processing approach,” Wiley, 1999 (“the Madsen reference”).

The SIMO active optical filter bank/combiner 2320 will include a set of active components which can include voltage-controlled reflection coefficients and voltage-controlled SOARs, for example. These active components can be used to alter the transfer functions of the optical signature filters inside of the SIMO active optical filter bank. For example, at the line rate, and in response to the log₂(n_(t)) bits that correspond to a spatial constellation point supplied as a column of the submatrix Γ₂, the active components can be used to cause the optically modulated laser signal from the block 2315 to be coupled through a selected one of the optical signature filters that are otherwise implemented using passive optical components. In other embodiments, the active components in the SIMO active optical filter bank can be used to alter the transfer function of an active optical filter to select one or more transfer functions of one or more corresponding optical signature filters. In other embodiments, the gains of certain particular SOARs could be used to select a sub-bank in the filter bank, and then inside that sub-bank, a single active optical filter could be responsive to one or more sub-components of the spatial constellation point to select from a plurality of pre-designated optical signature filter transfer functions that can be realized by the single active optical filter to realize a subset of transfer functions associated with the sub-bank.

As discussed in the Madsen reference, optical filters can be designed using multi-stage moving average (MA), multi-stage auto-regressive (AR) and multi-stage auto-regressive moving average (ARMA) based architectures. MA filters are also known as FIR (finite impulse response) filters, AR filters are also known as all-pole IIR (infinite impulse response) filters, and ARMA filters are also known as IIR filters with arbitrary poles and zeros. Therefore, any of the n_(t) different optical signature filters in the SIMO active optical filter bank/combiner 2320 can be implemented as any of these filter types. Other filter types are known, such as ring resonators and multi-port couplers, 2D lattice filters, N×M 2D lattice filters, higher dimensional lattice filters, and 2D active lattice filters, and such architectures could also be used to construct the entire SIMO active optical filter bank/combiner 2320, or sub-portions thereof.

Each optical signature filter used within the SIMO active optical filter bank/combiner 2320 can be designed to form a portion of the information contained in the channel matrix, H. It can be desirable to design the optical signature filters to be an orthogonal basis set. For example, the filter bank may preferably be constructed using discrete-time optical FIR filters that are preferably implemented using a multistage MA architecture as described in the Madsen reference. In such an example, if the filter coefficients of the FIR filters form a set of orthogonal basis vectors, then the optical signature filters correspond to a set of orthogonal filters. However, as discussed below, the total channel matrix, H, can include additional filters at the receiver, potentially matched to those in the transmitter, in which case the combination of the transmit and receive filter banks may be designed to be an orthogonal basis set. Also, for example, if space time shift keying (STSK) is being used in the system 2300, then the optical signature filters used within the SIMO active optical filter bank/combiner 2320 could implement a portion of the dispersion matrix associated with each optical signature filter in the filter bank 2320. Likewise, instead of having the signature filters within the discrete-time optical filter bank implement an orthonormal basis set, it is possible to use different types of optical filters such as all-pole optical lattice filters as described in chapter 5 of the Madsen reference. Such filters are easy to implement and could be used to provide a significant amount of signal separation as opposed to orthonormality. This type of design could lead to more compact and efficient optical signature filter banks at the expense of requiring the SISO decoder or other related signal processing hardware to work harder. That is, it is not required to implement an orthonormal basis set of filters, nor an approximation thereto. All that is really needed is to ensure that the filters are selected so that the overall channel matrix H is invertible, full rank, or has enough rank so that the SM-modulated or MIMO-modulated signals can be suitably recovered/reconstructed after passing through the channel H. The transmit and receive filters will influence H, as will any noise and distortion effects of the optical channel itself.
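
The rank condition stated above can be checked numerically. The following sketch assumes a simplified model in which each filter is represented by a vector of FIR coefficients and entry (j, i) of H is the inner product of receive filter j with transmit filter i. That model, the tolerance, and the function names are assumptions made only for illustration; a real H would also include the optical channel.

```python
def channel_matrix(tx_filters, rx_filters):
    """H[j][i] = <rx_j, tx_i> under the simplified (assumed) model."""
    return [[sum(complex(r).conjugate() * t for r, t in zip(rx, tx))
             for tx in tx_filters]
            for rx in rx_filters]


def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    rows = [[complex(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for c in range(n_cols):
        if rk == len(rows):
            break
        # Choose the remaining row with the largest entry in column c.
        pivot = max(range(rk, len(rows)), key=lambda r: abs(rows[r][c]))
        if abs(rows[pivot][c]) < tol:
            continue  # no usable pivot in this column
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# Hypothetical orthogonal signature set (rows of a 4x4 Hadamard matrix).
walsh = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
H_good = channel_matrix(walsh, walsh)

# A degenerate bank in which one signature filter is repeated.
repeated = walsh[:3] + [walsh[0]]
H_bad = channel_matrix(repeated, repeated)
```

Under this model the orthogonal bank yields a full-rank H, whereas the bank with a repeated filter loses rank, so two spatial constellation points could not be told apart at the receiver.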

The output of the optical communication channel is coupled to a receiver subsystem whose front end comprises a single input multiple output (SIMO) active optical receive filter bank 2330. The SIMO active optical receive filter bank 2330 includes internal active optical components that effectively split the received signal into n_(r) different receiver channels. In preferred embodiments of the system 2300, at this time, it is deemed desirable to set n_(r)=n_(t). For example, if the SIMO active optical receive filter bank 2330 includes n_(r)=n_(t)=16 internal receive filters, then a single input, multiple output SOAR can be designed to distribute the single input from the optical channel to the inputs of a set of n_(r) optical receive filters arranged into a parallel filter bank. While an architecture involving a SIMO SOAR that distributes the optical receive signal from the optical channel to n_(r) different optical receive filters arranged in parallel can be desirable, other optical filter architectures could alternatively be used. For example, 2D active optical lattice filters, N×M 2D active optical lattice filters, higher dimensional active optical lattice filters, or other architectures such as SIMO optical ring resonators and the like could be used to implement the voltage-controlled SIMO transfer function of the SIMO active optical receive filter bank 2330. The SIMO transfer function can also be viewed as a set of transfer functions in parallel from the single input to the multiple outputs.

It should be noted that the combination of the SIMO active optical filter bank/combiner 2320 and the SIMO active optical receive filter bank 2330 collectively provides an optical computing structure/architecture to emulate/perform the mathematical operation of the channel matrix, H. The system 2300 is able to implement the equivalent of a MIMO system, but makes use of the fact that a form of spatial modulation is used where a non-zero modulated signal is only applied to one of the active optical signature filters in the SIMO active optical filter bank/combiner 2320 at a time. The filters are designed so that, as a symbol passes through filter number i in the optical filter bank 2320, as much energy as possible of this symbol will pass through filter i in the receive filter bank 2330, while as much energy as possible is blocked from passing through the other filters, j≠i. The optical signature channel number corresponding to the receive filter that has the highest energy generally corresponds to the spatial constellation point's coordinate.

In preferred embodiments, the SIMO active optical filter bank/combiner 2320 and the SIMO active optical receive filter bank 2330 act as a set of matched filters that are maximally orthogonal. That is, if the laser modulated signal s(t) is coupled to channel i of the SIMO active optical signature filter bank 2320, then the optical receive filter i of the SIMO active optical receive filter bank 2330 will be matched to provide at its output as much of the signal s(t) as is possible. Also, the rest of the optical receive filters in the active optical receive filter bank 2330 will be designed to provide at their output as little of the signal s(t) as is possible. This can be achieved, for example, by designing the cascade of each pair of optical signature filter i and optical receive filter i to be as close as possible to an orthogonal basis vector relative to all the other filter channels, j≠i. For systems like STSK, where dispersion matrices are used, the filters in the signature and receive filter banks can be designed in accordance with a desirable and selected set of fixed dispersion matrices, for example.
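
A minimal sketch of this matched, orthogonal filter behavior is given below. It assumes a noise-free chip-level model in which one symbol is spread over the taps of the selected signature filter and each receive filter is a correlator matched to one signature; the Hadamard-row signatures and function names are hypothetical.

```python
def transmit(symbol, tx_filters, i):
    """Pass one symbol through signature filter i (chip-level model)."""
    return [symbol * h for h in tx_filters[i]]


def detect(chips, rx_filters):
    """Correlate with every receive filter and pick the strongest output.

    Returns (i_hat, symbol_hat): the index of the receive filter with the
    highest output energy and the symbol recovered from that branch.
    """
    outputs = [sum(complex(h).conjugate() * c for h, c in zip(f, chips))
               for f in rx_filters]
    energies = [abs(o) ** 2 for o in outputs]
    i_hat = max(range(len(energies)), key=energies.__getitem__)
    norm = sum(abs(h) ** 2 for h in rx_filters[i_hat])
    return i_hat, outputs[i_hat] / norm


# Hypothetical orthogonal signatures (rows of a 4x4 Hadamard matrix),
# used for both the transmit bank and the matched receive bank.
signatures = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
sent_symbol = 0.5 - 0.5j
chips = transmit(sent_symbol, signatures, 2)
i_hat, symbol_hat = detect(chips, signatures)
```

With orthogonal signatures and no noise, only the matched branch has non-zero energy, so the detected index identifies the spatial constellation point and the matched branch output returns the modulated symbol. With non-orthogonal filters the other branches would carry residual energy that the decoder would have to resolve.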

The output of the SIMO active optical receive filter bank 2330 is coupled to a coherent detector and processor/memory interface 2335. The coherent detector and processor/memory interface 2335 uses a set of n_(r) coherent detectors to convert the n_(r) different (multiple) outputs from the SIMO active optical receive filter bank 2330 from optical signals to electrical signals. The outputs of the n_(r) coherent detectors are sampled at a sample instant and converted into n_(r) different respective digital signals, each with real and imaginary components (complex numbers) corresponding to the I and Q components. The set of n_(r) received complex-number signal points is then stored in a memory. The memory is preferably arranged in an ordering related to the ordering of the signal points as observed at the receiver front end where the signal points are sampled. This memory preferably is double buffered and keeps track of all of the information related to the received signal that was received in each symbol interval of a coding frame from each of the spatial channels (outputs of each of the optical receive filters in the SIMO active optical receive filter bank 2330). While one memory is being processed by a SISO decoder 2340, another memory is being loaded with new information received from the optical channel.

The information that is stored in the memory associated with the block 2335 will be used to compute an initial set of bit metrics to be used in the SISO decoder 2340 that is operably coupled to the memory. The information stored in the processor/memory interface portion of the block 2335 is processed via a CICM deinterleaver 2350 and used to compute the initial set of bit metrics used by the SISO decoder 2340. Each time the SISO decoder 2340 executes a SISO decoding iteration, the updated extrinsic information is processed via a CICM interleaver 2345 and used to compute updated bit metrics. The updated bit metrics are then passed through the CICM deinterleaver 2350 for use in the next SISO iteration. The SISO decoder is allowed to compute SISO iterations until a convergence criterion or stopping condition is met. The output of the SISO decoder is a decoded frame of information bits, which exits the SISO decoder on the output arrow to the right of the SISO decoder block 2340. In a preferred embodiment the SISO decoder is designed to decode a CTBC code. As discussed above, the CICM approach can also be used with other types of codes such as block codes, convolutional codes, turbo product codes, and, depending on the particular selected code, certain turbo codes and LDPC codes where the P(d) table can be determined.

In some alternative embodiments the constellation and spatial mapper 2310 can be implemented using other spatial modulation techniques instead of CICM. In non-CICM SM embodiments of the system 2300, the P(d), for d=d_(t), d_(t)+1, . . . , d_(f), tables do not need to be determinable. For example, if the bit stream is encoded using an LDPC code for which these tables cannot be determined, possibly concatenated with large block codes or the like as is used sometimes in OTN, or if for any other reason CICM is not desired to be used in a given embodiment, then block 2310 would perform any selected SM algorithm other than CICM-SM to constellation map and spatially map the coded input bits coming into the left of block 2310 onto a sequence of constellation/spatial constellation points. That is, the system 2300 is general enough to be used with CICM or any other identified SM technique. Key novel features beyond SRCI CTBC codes and CICM include the use of the blocks 2320 and 2330 and other blocks in the system 2300 that allow the optical signal that traverses the optical channel to be processed as an SM type signal similar to the way shown in FIG. 23.

That is, an aspect of the present invention relates to the use of an optical filter bank 2320 to transform an optical-modulated laser signal (output of block 2315) to an SM-multichannel signal. The SM-multichannel signal can be viewed as a collection of n_(t) optical filter bank channel outputs of the optical filter bank inside block 2320. The optical filter bank preferably includes a collection of discrete-time optical filters arranged in parallel, and the implementation preferably uses multistage architectures as are known to those of skill in the art via the Madsen reference and the numerous citations to related work provided therein. The optical filter bank inside block 2320 uses the same structure as shown in blocks 2105, 2110, but instead of the SM-multichannel signal exiting from multiple antennas and passing through the transmit portion, H_(t), of the channel matrix H, the optical SM-multichannel signal is the multichannel output of the optical filter bank (H_(t)) inside block 2320. The block 2320 also includes a combiner that is coupled to receive the SM-multichannel signal and combine the n_(t) multichannel component signals to form a single modulated laser signal. The combining operation is equal to or similar to a summation operation and is preferably carried out/implemented/embodied using one or more optical combiners. The output of the block 2320, i.e., the single modulated laser signal, is then transmitted onto the optical channel. At a receiver, 2330, 2335, 2340, 2345, 2350, a noisy and optical-channel-distorted version of the single modulated laser signal is then received from the optical channel. At the receiver, the SIMO optical filter bank 2330 or a variation thereof is used to decompose the single modulated laser signal into a plurality of multichannel component signals. This plurality of multichannel component signals can be viewed as a reconstruction of an estimated version of the SM-multichannel signal. While the preferred embodiment uses a multichannel SISO decoder 2340, 2345, 2350 to decode the estimated version of the SM-multichannel signal, other types of decoders can alternatively be used with the present invention. Other types of decoders would include hard iterative decoders, or any other type of decoder used to decode any kind of code, such as an LDPC decoder, possibly in operation with block code decoders, or turbo product decoders as are commonly used in OTN applications, or the like. Even simple channel decoders such as a multichannel equalizer followed by a conventional slicer/decision circuit could be used in the place of blocks 2340, 2345, 2350 in FIG. 23.

It can also be noted that an aspect of the present invention as per the system 2300, in its broader context, teaches a broader genus of inventions that need not necessarily be implemented in optics. As is well known, digital filter banks can be readily implemented. For example, fred j harris, “Multirate signal processing for communication systems,” Prentice-Hall, 2004 (“the harris reference”) describes how multirate digital signal processing techniques can be used to construct digital filter banks that involve sub-band processing. Multirate signal processing makes extensive use of bandpass sampling and can be useful to lower the computational load associated with filtering a band pass modulated signal, especially in embodiments where the modulated signal output of 2315 is centered at an RF (radio frequency) carrier frequency. Single rate digital filter banks can also be readily constructed and used in embodiments that do not perform resampling but have all parallel filters operating in all filter channels at a single sampling rate that is the same as the input and/or output signals. Also, MISO (multiple input, single output) type digital filter banks are well known that can be viewed as having multiple parallel inputs that feed to multiple parallel filter channels, and a summing junction (digital combiner) that is used to add the outputs of the multiple parallel filters to provide a single output. Therefore, the block 2320 could be implemented as a MISO digital filter bank using purely digital hardware. Similarly, a SIMO digital filter bank could be used to implement the block 2330. A SIMO digital filter bank sends a single input stream to multiple parallel digital filter channels and provides a multi-channel output signal. Being digital, digital filter banks can be implemented using one or more instruction set processors coupled to memory. This could be dedicated hardware or shared with other digital signal processing hardware in the system.

In all-digital embodiments, the laser 2305 and the optical modulator 2315 are replaced by a standard digital physical layer channel interface such as a BPSK, QPSK, QAM, OFDM, or any other kind of modulator for a given channel that can be used with spatial modulation. The modulated signal output of the non-optical version of the physical layer block 2315 is passed in digitized form to the block 2320, which performs SIMO filter bank operations. Since in SM only one transmit channel is used at a time, the spatial modulation constellation point coming from Γ₂ of block 2310 will select a filter from the SIMO filter bank to be applied to the modulated signal during a given symbol interval. The output of the block 2320 will thus be a filtered version of the modulated signal, where a selected filter from the digital filter bank 2320 is applied each symbol interval. The filters inside the digital filter bank 2320 are preferably selected to allow the spatial modulation constellation point to be resolved at the receiver. The output of the block 2320 can be sent directly to a digital to analog converter (DAC), or to a line/channel interface that includes an analog reconstruction filter. In 5G wireless and similar types of wireless embodiments, the channel interface could be an air interface such as used in 3G, 4G or 5G cellular, or as used in WiFi wireless local area networks, or as used in 802.16 type WiMAX systems. In other types of embodiments, the line interface could correspond to a DSL (digital subscriber line) broadband twisted pair telephone line, or could correspond to a cable modem type channel interface.

An advantage to using this alternative MISO/SIMO approach 2300 as opposed to a multi-antenna embodiment is that the filters in the digital filter bank can be made to be adaptive. The filter response of adaptive filters can be varied as a function of current channel conditions. Therefore, adaptive filters can be adjusted or otherwise changed or updated to improve the properties of the overall channel matrix, H. While multi-antenna embodiments rely on a complicated MIMO type channel model, the above described MISO/SIMO approach 2300 can better select and control the overall channel matrix, H. Another advantage is that while current SM systems as used in cellular networks can have a large number of antennas in the downlink from the base station to the mobile, the mobile unit (handset) itself can only have a small number of antennas due to mobile-unit size constraints. When the above described alternative MISO/SIMO approach 2300 is used, the mobile unit can have a large number of equivalent SM channels. Also, even in the base station, it can be more cost effective to implement the multiple downlink channels using a digital filter bank because this eliminates extra antennas and also provides more control in selecting and maintaining a desired channel matrix, H.

The digital, analog, or discrete-time filter banks can be used to implement the transmitter and/or receiver portions of the channel matrix H. That is, the SM/MIMO channel matrices H_(t) and H_(r) (where H=f(H_(t), H_(ch), H_(r))) can be implemented using the filter banks in the transmitter and/or receiver, and H_(ch) would be the actual communication channel. The actual communications channel matrix H_(ch) may be a scalar if both the transmitter and the receiver include the above-described filter banks and only one antenna or only one line interface is used. Also, mixed systems that use filter banks to implement H_(t) and H_(r) but also use multiple antennas/physical channels may be constructed. Similar to the optical embodiment described above, the filters in these digital, analog or discrete-time filter banks can be selected to be orthonormal, square-root orthonormal, or some other type of non-orthogonal basis functions such as all-pole filters or filters with poles and zeros. That is, it is not required to implement an orthonormal basis set of filters, nor an approximation thereto. All that is really needed is to ensure that the filters are selected so that the overall channel matrix H is invertible, full rank, or has enough rank so that the SM-modulated or MIMO-modulated signals can be suitably recovered/reconstructed after passing through the channel H. The transmit and receive filters will influence H, as will any noise and distortion effects of the channel itself.

In yet another embodiment, instead of implementing the blocks 2320 and 2330 using parallel MISO and SIMO digital filter banks, either analog filter banks or discrete-time filter banks (such as tapped delay lines or SAW (surface acoustic wave) filter banks) are used. That is, the modulated signal such as the BPSK, QPSK, QAM, OFDM, or any other kind of modulated signal for a given channel is generated at the block 2315 and the block 2320 is operative to pass the modulated signal to a selected analog or discrete-time filter, where the selection is made in accordance with the spatial constellation point supplied by the spatial modulator during the given symbol interval. That is, all operations are similar to the above-described digital filter bank embodiment/approach, except the DAC operation is performed before the block 2320 instead of after it, and ADC (analog to digital conversion) is applied after the block 2330 instead of before it. Such embodiments are practical in some cases because the center frequency and/or the bandwidth of the modulated signal can make the digital filter bank operations require very high processing speeds. It is envisioned that a single chip could be used to implement the blocks 2320 and 2330 for use in a handset using either analog filter technology or discrete-time filter technology, and in the case of block 2320, under digital selective control in accordance with the spatial modulation constellation point coming in each sampling interval from the block 2310 (Γ₂).

In the discussion below, FIG. 24 is described in connection with an OTN optical communications embodiment. While FIG. 23 deals with an SM embodiment, FIG. 24 deals with a MIMO embodiment. The key difference between the systems/methods of FIG. 23 and FIG. 24 is thus that while block 2320 applies one filter during each sample interval, block 2420 applies a plurality of filters to a plurality of different modulated signals each symbol interval. Hence all of the above discussion relating to embodiments of FIG. 23 that alternatively use digital filter banks, analog filter banks, or discrete-time filter banks also applies to the MIMO-modulation embodiment of FIG. 24.

CICM MIMO OTN Embodiment:

Referring now to FIG. 24, a transmitter, a channel and a receiver for CICM based optical MIMO communication is illustrated in block diagram form. The transmitter and receiver depicted in FIG. 24 can be embodied as a method, an apparatus, a device, or a system. A set of n_(t) number of lasers 2405 provides a set of n_(t) number of optical carrier waves for optical communications. Similar to the system 2300, where dense wavelength division multiplexing (DWDM) is used, then the bank of lasers 2405 and the entire system 2400 would be repeated for every optical channel and polarization in the DWDM channel bank. For example, if the DWDM system had 80 channels, then the bank of lasers 2405 and the system 2400 would be repeated 160 times, once at each of the 80 wavelengths, and once for horizontal and vertical polarizations at each wavelength. From here forward, the system 2400 will be described at a single wavelength, and a single polarization, knowing that the following discussion of the system 2400 could be repeated at each wavelength and/or each polarization.

A constellation and spatial mapper 2410 is provided to receive an input bit stream which is presented to the block 2410 on the input arrow to the left. For example, the input bit stream can be a CTBC encoded bit stream. As discussed earlier in connection with CICM, the input bit stream can be any coded bit stream for which a set of tables P(d), for d=d_(t), d_(t)+1, . . . , d_(f) can be constructed. In the MIMO embodiment 2400, the signal mapper 2410 can optionally include a spatial constellation mapper component. This optional spatial mapping component is shown in dotted lines as an optional output from the mapper 2410 to a MIMO optical signature filter bank/combiner block 2420. In embodiments where the optional spatial mapping component is supplied by the signal mapper 2410, the signal mapper 2410 becomes a constellation and spatial mapper 2410 (as shown in FIG. 24). Block 2410 preferably includes a constellation-mapper so that its output includes a constellation signal point. Also, in such embodiments, the block labeled 2420 becomes a MIMO active optical signature filter bank/combiner block 2420. The active components can optionally be used even when there is no spatial modulation component coming from the signal mapper 2410, but when such a spatial modulation component is present, the active optical filter and/or amplifier components are needed in the block 2420 to allow it to adapt its transmit MIMO transfer function in accordance with the spatial modulation component. An example of this type of system is a GSM embodiment and other types of embodiments where a subset comprising more than one antenna can be selected during each symbol interval by the spatial modulation component.

In embodiments that use MIMO vector modulations such as V-BLAST and D-BLAST, the constellation and spatial mapper 2410 performs spatial mapping by providing n_(t) number of signal constellation points to be transmitted via the vector, x, each symbol interval. In such modulations the dotted arrow coming from the spatial modulation sub-matrix, Γ₂, is empty and there is no separate spatial modulation matrix Γ₂. Instead, in this case, the spatial portion of the modulation is performed by virtue of mapping n_(t) number of signal constellation points to be transmitted via the vector, x, each symbol interval.

The outputs of the lasers 2405 couple to the laser-inputs of a bank of optical modulators 2415. The optical modulators each receive a second input of m bits each, where each m-bit input is representative of a signal constellation point that tells each respective optical modulator how to modulate its laser input to produce a modulated laser output. The modulation is performed in accordance with the signal constellation point supplied as a column of the submatrix Γ₁ by the constellation and spatial mapper 2410. For example, when m=4, each column of the submatrix Γ₁ could identify n_(t) groups of four coded bits each that respectively identify a respective 16-PSK signal point to which each respective group of m=4 coded bits will be mapped in a given symbol interval. In this example a particular CTBC code is used to encode the bit stream input to the block 2410. During each symbol interval, the encoded bit stream is mapped to a vector comprising n_(t) CICM-16-PSK constellation points. For example, if n_(t)=4, this will increase the data rate by n_(t)=4 times the data rate of a single CICM-16-PSK channel. If n_(t)=16, this will increase the data rate by n_(t)=16 times the data rate of a single CICM-16-PSK channel. In general, this type of transmission can provide a speed up of n_(t) times the data rate of a conventional system that does not use the MIMO processing of the system 2400.
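The per-interval mapping described above can be sketched as follows. This is a minimal illustration only: it assumes a natural binary labeling of the 16-PSK points, whereas an actual CICM mapper would use the constellation labeling and permutation designed elsewhere in this specification. The function names are hypothetical.

```python
import cmath


def psk_point(bits):
    """Map a group of m bits to a unit-energy 2^m-PSK point.

    ASSUMPTION: natural binary labeling, used only for illustration.
    """
    m = len(bits)
    index = int("".join(str(b) for b in bits), 2)
    return cmath.exp(2j * cmath.pi * index / (1 << m))


def map_symbol_interval(coded_bits, m, n_t):
    """Map m*n_t coded bits to a length-n_t vector x of constellation points."""
    if len(coded_bits) != m * n_t:
        raise ValueError("expected exactly m*n_t coded bits per symbol interval")
    return [psk_point(coded_bits[i * m:(i + 1) * m]) for i in range(n_t)]


# One symbol interval with m=4 (16-PSK) and n_t=4 modulators:
# 16 coded bits are carried, versus 4 for a single 16-PSK channel.
bits = [0, 0, 0, 0,  0, 1, 0, 0,  1, 0, 0, 0,  1, 1, 0, 0]
x = map_symbol_interval(bits, m=4, n_t=4)
print(len(bits) // 4, "times the bits of a single channel")
```

Each entry of x would drive one optical modulator of the bank 2415 during the symbol interval.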

In the system 2400, to maintain a system-wide symbol Hamming distance, the CICM mapping rule design algorithm can be configured to create a CICM constellation and spatial mapper that does not allow more than one bit from any given low weight error sequence to be transmitted during any given symbol interval. In such embodiments, a single CICM permutation matrix, Γ, is designed with m*n_(t) number of bits per column. In such embodiments there will be n_(t) number of columns that correspond to each Euclidean distance in the signal constellation. In other embodiments, the symbol Hamming distance will only be effective within each one of the n_(t) different channels. In such embodiments, a single CICM permutation matrix, Γ, can be designed that has m rows and K/m columns as in the single-channel case. Now, however, a set of n_(t) columns of Γ will be mapped to separate filter channels during each symbol interval.
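The two levels of constraint described above can be illustrated with a simple checker. This sketch is not the CICM mapping rule design algorithm itself; it only tests a candidate arrangement of coded-bit indices against a hypothetical list of low weight error sequences, each represented as a set of bit positions. All names and the example data are assumptions made for illustration.

```python
def satisfies_constraint(columns, error_patterns, cols_per_interval=1):
    """Return True if no symbol interval carries more than one bit of any pattern.

    columns: list of columns of the permutation matrix, each a list of
             coded-bit indices (m indices per column).
    error_patterns: iterable of sets of coded-bit indices (hypothetical
             low weight error sequences).
    cols_per_interval: 1 checks each channel separately; n_t checks the
             system-wide constraint over all columns sent in one interval.
    """
    for start in range(0, len(columns), cols_per_interval):
        interval_bits = set()
        for col in columns[start:start + cols_per_interval]:
            interval_bits.update(col)
        for pattern in error_patterns:
            if len(interval_bits & set(pattern)) > 1:
                return False
    return True


# Hypothetical example: K=8 coded bits, m=2 bits per column.
gamma_columns = [[0, 4], [1, 5], [2, 6], [3, 7]]
low_weight = [{0, 1, 2}]

print(satisfies_constraint(gamma_columns, low_weight, cols_per_interval=1))
print(satisfies_constraint(gamma_columns, low_weight, cols_per_interval=2))
```

In this example the arrangement is acceptable when the symbol Hamming distance is only required within each channel, but not when two columns are sent together and the constraint is applied system-wide.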

Also, as applies to the system 2300 as well, if transmission is occurring on both the vertical and horizontal polarizations, the CICM mapping rule can be designed to treat the horizontal and vertical polarizations as being imperfectly coupled, in which case it is desired to view the symbol Hamming distance as involving the bits sent on both the horizontal and vertical polarizations during a given symbol interval. However, because it is common to apply a small 2×2 rotation matrix to correct for imperfections in the horizontal and vertical polarizations, the CICM mapping rule can be alternatively designed to treat the horizontal and vertical polarizations as being perfectly isolated, in which case it is desired to view the symbol Hamming distance as involving the bits sent only on the horizontal or the vertical polarization during a given symbol interval. If the vertical and horizontal polarizations are considered to be perfectly isolated, then one column of Γ will be mapped to the horizontal polarization and another column of Γ will be mapped to the vertical polarization during each symbol interval. Similarly, when the systems 2300 and 2400 are applied in DWDM systems, since the different wavelengths can generally be considered to be isolated, and assuming isolated/corrected polarizations, a full 160 columns of Γ can be mapped each symbol interval. As mentioned above, soft interference cancellation can also be used with the SISO decoder to cancel the effect of the polarization cross talk.

The MIMO optical signature filter bank/combiner 2420 receives a length-n_(t) vector of signal constellation points. If the submatrix Γ₂ is in use, a column from Γ₂ indicates a subset of the n_(t) vector inputs to process during a given symbol interval. The n_(t) outputs of the internal optical signature filters within the MIMO optical signature filter bank are sent to a combiner that is located within the block 2420. The combiner can be implemented using known optical technology to include merging optical paths in an optical integrated circuit or fiber couplers that have multiple optical input fibers which are length matched and combined to form a single output. The single output is coupled to an optical channel such as a fiber optic cable or a free space laser channel. The output of the optical channel is coupled to a SIMO active optical receive filter bank 2430. Each of the active optical signature filters 2420 and optical receive filters 2430 can be implemented in accordance with known technology, as described above in relation to the Dowling, MacFarlane and Madsen references, for example. The structure and operation of the optical receive filter bank 2430 are largely the same as described in connection with FIG. 23.

Because in this case the vector x can be carrying information symbols on up to all n_(t) channels per symbol interval, the action of the combiner in the block 2420 will be to form a linear combination of all of the columns of the submatrix H_(t) each interval and eventually the matrix H each frame. These signals can be separated in the receiver as long as all of the columns of H_(t) and/or H are linearly independent. The more orthogonal the columns of H_(t) and/or H are, the easier it will be to effectively invert these matrices using a predetermined and matched orthogonal matrix in the receiver. The transfer functions inside the H_(t) submatrix are optical transfer functions and are applied during each symbol interval.

If the portion of the channel matrix H of equation (31) that is active during any symbol interval is factored as H=H_(r)H_(t), then the MIMO optical signature filter bank can be viewed as having a matrix transfer function of H_(t) while the SIMO active optical receive filter bank 2430 can be viewed as having the transfer function H_(r). In certain preferred embodiments, the channel matrices are constructed using orthogonal basis sets so that the matrix H is, or approximates, a constant times an identity matrix, and the matrices H_(r) and H_(t) can be viewed as orthogonal filter matrices.
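The effect of choosing orthogonal filter matrices can be illustrated with a small numerical sketch. Here a 4×4 Hadamard matrix stands in for H_(t) and its transpose for H_(r); this choice, the ideal noiseless channel, and the function names are assumptions made only for illustration and are not taken from the optical filter designs referenced above.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


n_t = 4
h_t = hadamard(n_t)      # stand-in transmit signature matrix (assumption)
h_r = transpose(h_t)     # matched receive filter matrix
h = matmul(h_r, h_t)     # overall H = H_r H_t, equal to n_t times the identity

# Noiseless transmission of a vector of constellation points.
x = [1 + 0j, -1j, 0.5 + 0.5j, -1 + 0j]
y = matvec(h_r, matvec(h_t, x))
x_hat = [v / n_t for v in y]   # receiver only needs to undo the constant
print(h)
```

Because H reduces to a constant times the identity in this sketch, the n_(t) simultaneously transmitted points separate without any matrix inversion in the receiver.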

The blocks labeled 2435, 2445 and 2450 perform similar functions and have similar structures to the corresponding blocks 2335, 2345 and 2350 in FIG. 23. However, in addition to the operations described in connection with FIG. 23, the data in the memory of block 2435 and the SISO decoder/interference canceller 2440 also perform one or more additional functions. The reason is that in the system 2300, data is only transmitted through one active signature filter at any time, at least to within relatively short-timed dispersion matrix time constants. In the system 2400, the SISO decoder/interference canceller 2440 needs to detect each of the n_(t) transmitted signals in the presence of interference due to the other signals transmitted during the same interval through different optical signature filters in the MIMO optical filter bank/combiner 2420. Therefore the SISO decoder/interference canceller 2440 uses the same structure and performs the same functions as described in connection with the block 2125 in FIG. 21.

OFDM Related Embodiments:

Another class of embodiments contemplated by the present invention involves OFDM (orthogonal frequency division multiplex) systems, also known as DMT (discrete multitone) type systems. These systems map an entire frame of data onto a set of N carriers, usually using a DFT (discrete Fourier transform) that is implemented using an FFT (fast Fourier transform) and its inverse transform. In such systems, each subcarrier is typically viewed as carrying one QAM type data symbol each frame. The collection of all QAM symbols on all subcarriers per OFDM frame is called an OFDM symbol. The OFDM symbol interval is thus the OFDM frame size plus possibly a cyclic prefix duration that is used as a guard interval to separate the OFDM symbols in time enough so that FFT processing can be employed in a demodulator. As is known in the art, the collection of QAM data symbols on the various subcarriers can be encoded using FEC and/or TCM.

In accordance with an aspect of the present invention, the OFDM symbol is formed by modulating the carriers using a CTBC encoded data sequence. The CTBC code's frame size may be the same as the number of bits mapped per OFDM symbol interval, or for example, one CTBC frame of data can be mapped to an integer or fractional number of OFDM symbols. For example, if K=1024 and the number of subcarriers is 256, with one coded bit carried per subcarrier, then the CTBC frame would be carried by four OFDM symbols. If K=1024+128=1152, then the CTBC frame would be carried by four and a half OFDM symbols. In actual OFDM systems, certain subcarriers may be used as reference tones for synchronization purposes, but the general idea is that CTBC encoded data may be mapped to one or more OFDM symbols.
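The frame-to-symbol arithmetic in the preceding example can be reproduced with a short sketch. The example figures in the text correspond to one coded bit per subcarrier; the bits_per_subcarrier parameter and the function name are hypothetical additions that generalize the calculation to larger constellations, and reference tones are ignored.

```python
from fractions import Fraction


def ofdm_symbols_per_frame(k_bits, n_subcarriers, bits_per_subcarrier=1):
    """Number of OFDM symbols needed to carry one K-bit CTBC encoded frame.

    Assumes every subcarrier carries data (no reference tones).
    """
    return Fraction(k_bits, n_subcarriers * bits_per_subcarrier)


print(ofdm_symbols_per_frame(1024, 256))      # K=1024, 256 subcarriers
print(ofdm_symbols_per_frame(1152, 256))      # K=1024+128
print(ofdm_symbols_per_frame(1024, 256, 4))   # same frame with 16-PSK per subcarrier
```

Using an exact fraction rather than floating point keeps fractional results such as four and a half OFDM symbols unambiguous.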

Also, in the OFDM transmitter, instead of using TCM-QAM, for example, CICM-PSK could be used to modulate each subcarrier. In systems like DSL (digital subscriber line) modems, where different numbers of bits are mapped to different subcarriers depending on channel conditions, different sized PSK constellations could be used at different subcarriers. The CICM permutation and constellation encoding could be carried out separately for each subcarrier, or could be carried out across subcarriers, k, depending on the embodiment. Hence the present invention specifically contemplates all variations using CICM in the time and subcarrier domains or a combination of both.

Certain embodiments of the present invention would encode a frame of data into a K-bit CTBC encoded frame, then apply CICM interleaving, and then apply a reverse Gray coded or other similar constellation mapping such as anti-Gray coding to each subcarrier. For example, a CICM-16-PSK could be used to modulate each subcarrier. Depending on the embodiment, each subcarrier could carry a separate CTBC/CICM encoded data frame, or a single data frame could be spread across the entire set or a subset of the subcarriers. A 5G LTE system could be designed using a CTBC code and CICM-PSK type modulation.

Also, embodiments of the present invention are envisioned that do not use CTBC codes but instead use any of the types of codes that are discussed above that can be used with CICM. For example, as discussed in the unequal error protection section above, a single (8,4) or longer block code could be used with CICM to provide equal or unequal error protection. In an OFDM embodiment, the CICM can be used with any suitable code as discussed above and does not need to be a CTBC code. That is, any suitable code can be used to create a valid CICM signal mapper with a CICM permutation and a selected constellation mapper, and this CICM can then be used to modulate either a single subcarrier or could be spread across multiple subcarriers. If CICM is used with a block code or a convolutional code, something similar to TCM results; however, CICM can perform better than TCM over fading channels. Therefore, the current TCM-QAM used in various standards to modulate subcarriers can be substituted with an appropriate CICM scheme, such as a CICM-PSK scheme that is derived from a block code or a convolutional code. Other codes such as turbo product codes, turbo codes and LDPC codes can also be used as long as their P(d) tables can be constructed as discussed above.

As is discussed in R. Y. Mesleh et al., “Spatial Modulation,” IEEE Transactions on Vehicular Technology, Vol. 57, No. 4, July 2008, pp. 2228-2241, (“the Mesleh reference”) both SM-OFDM and VBLAST-OFDM (MIMO) are known. The present invention contemplates that known SM-OFDM and MIMO-OFDM can also be improved by using CTBC coding of the bit stream, and/or CICM encoding of each subcarrier, preferably using CICM-PSK for each subcarrier's constellation mapping. The present invention also contemplates using the optical and/or non-optical versions of the SM and MIMO system configurations as shown in FIGS. 21, 23 and 24. Moreover, especially in systems implemented in accordance with FIGS. 23 and 24, whether they be implemented using an optical, wireless, or wireline channel, when blocks 2320, 2330, 2420, 2430 are used, it is noted that a much higher number of channels can be practically implemented as compared to prior art multiple antenna SM-OFDM and MIMO-OFDM (e.g., VBLAST-OFDM) based designs that require N_(t) transmit and N_(r) receive antennas to implement the spatial channels. With the present invention SM-OFDM and MIMO-OFDM can even be implemented in systems where there is only one transmit antenna and one receive antenna. The present invention allows N_(t) and N_(r) to represent, instead of the number of transmit and receive antennas, the number of outbound spatial channels leaving the transmitter and the number of inbound spatial channels entering into the receiver. This allows N_(t) and N_(r) to be made much higher without the cost of adding extra antennas. When N_(t)=N_(r) the spectral efficiency of SM-OFDM rises as a function of log₂(N_(t)) and the spectral efficiency of MIMO-OFDM rises linearly with N_(t).

Referring now to FIG. 25, an embodiment of a system, a method and an apparatus 2500 is shown in block diagram form. Inbound to an SM-OFDM spatial constellation mapper 2505 is an OFDM symbol matrix, Q(k). In this context the parameter k is used to denote frequency domain data. The subcarriers can be viewed as being indexed by k (or k−1). The matrix Q(k) can be viewed as being similar to the matrix Γ, but adapted to an OFDM frame size. That is, suppose each OFDM frame has N number of subcarriers. In this example, assume all subcarriers are used to carry useful information bits, although it is understood that in other examples, certain subcarriers could be reserved to carry known timing/synchronization symbols. Also assume, in this example, that each subcarrier is modulated with m data bits so as to carry a 2^(m)-ary data symbol. Also, let m_(spatial)=log₂(N_(t))=log₂(N_(r)). With these parameters defined, the matrix Q(k) can be viewed as an (m+m_(spatial))×N matrix of binary bits. Similar to how Γ is defined in SM and MIMO applications with the submatrices Γ₁ and Γ₂, Q(k) can be viewed as having a Q₁(k) binary submatrix of size m×N stacked above a Q₂(k) binary submatrix of size m_(spatial)×N. When CICM is optionally also being applied in a given embodiment, the matrices Q(k), Q₁(k), and Q₂(k) can be loaded by sliding a window across the Γ matrix so that a block of N columns of Γ is mapped to each OFDM symbol.

Similar to how the bits in the Γ₂ matrix are used in FIG. 23 and FIG. 24 to select a sequence of spatial channels to be used to send each column of the Γ₁ matrix, the bits in the Q₂(k) matrix are used to select a sequence of spatial channels to be used to send each column of the Q₁(k) matrix. However, in the OFDM embodiment of FIG. 25 each of the columns, numbered 1, . . . , N of the Q₁(k) matrix will be sent on a respective OFDM subcarrier, labeled 1, . . . , N. Hence the operation performed by the SM-OFDM spatial constellation mapper 2505 can be viewed as transforming the matrix Q(k) to a set of spatial channel sequences, X₁(k), . . . , X_(Nt)(k). Again, here the k-parameter indicates that the X-vectors contain frequency domain data. In general, if there are N subcarriers, N_(t)≦N is preferably chosen such that N/N_(t) is an integer, i.e., N_(t) divides N, and in the extreme case, N_(t)=N, so that m_(spatial)=log₂(N). This extreme case is now practical to implement since separate antennas are not needed to increase N_(t). Therefore the example embodiment where N_(t)=N becomes important and desirable.

To understand the function of the block 2505 to perform the mapping Q(k)→{X₁(k), . . . , X_(Nt)(k)}, consider all of the elements of the vectors {X₁(k), . . . , X_(Nt)(k)} to be initially set to zero. Note that the k^(th) column of Q₁(k) corresponds to a complex-valued signal constellation point to be transmitted onto the k^(th) subcarrier and to be carried by the spatial transmit signature channel number given by the k^(th) column of Q₂(k). Hence similar to blocks 2310 and 2410, block 2505 will sequentially map each k^(th) column of Q₁(k) to a constellation point in the k^(th) position of a selected channel vector, X_(sp)(k), where the subscript sp∈{1, 2, . . . , N_(t)} is equal to the binary value of the k^(th) column of Q₂(k). Since all of the elements of all of the vectors {X₁(k), . . . , X_(Nt)(k)} were initialized to zero, by the time each column of Q₁(k) has been mapped to a complex number in the k^(th) position of the Q₂(k)-selected channel vector, X_(sp)(k), the vectors {X₁(k), . . . , X_(Nt)(k)} will contain all zeros except for the frequency bins k into which a complex number corresponding to a signal constellation point has been inserted. Because each frequency bin k is mapped to one channel, it will never be the case that any one spatial channel sends more than one signal constellation point on any one subcarrier at a time. If the {X₁(k), . . . , X_(Nt)(k)} are viewed as row vectors all stacked into a matrix X(k), the matrix X(k) will be of size N_(t)×N, and each k^(th) column of X(k) will contain all zeros, except for the sp^(th) row, which will contain a complex number. Here the spatial channel number, sp, corresponds to the binary value of the k^(th) column of Q₂(k) and the complex number corresponds to the signal constellation point determined by constellation mapping the k^(th) column of Q₁(k).
Next the frequency domain set of spatial channel vectors is sent to a frequency domain filter bank 2510 where the equivalent of time-domain filtering (convolution) is implemented using point-wise multiplications in the frequency domain. If needed, a cyclic prefix or guard interval can be used to ensure that the convolutional tails between OFDM symbols are maintained. The signature filter bank 2510 thereby applies a frequency domain spatial-channel signature value to each frequency bin in each transmit spatial channel. To keep the guard band short, the signature filters can IFFT back to zero padded impulse-response vectors in the time domain. The outputs of the frequency domain filter bank are then summed together in a summing junction and the sum is then sent to an inverse FFT/OFDM modulator 2515. The time-domain output signal from this OFDM modulator is then converted to analog or otherwise coupled onto a physical channel for transmission. The physical channel can be wireless, wireline, or optical, and in general, can involve one or more antennas, although a preferred embodiment only uses one antenna at the transmitter and one antenna at the receiver to implement the equivalent of the matrix MIMO type channel, H.
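The mapping performed by the block 2505 and the point-wise signature filtering and summation that follow it can be sketched as below. The sketch assumes a natural binary PSK labeling in place of a designed CICM constellation mapping, uses zero-based channel indices rather than the 1, . . . , N_(t) numbering of the text, and stops before the inverse FFT. The function names and example signature values are hypothetical.

```python
import cmath


def psk_point(bits):
    """ASSUMPTION: natural binary labeled 2^m-PSK, for illustration only."""
    index = int("".join(str(b) for b in bits), 2)
    return cmath.exp(2j * cmath.pi * index / (1 << len(bits)))


def sm_ofdm_map(q1_cols, q2_cols, n_t):
    """Build the N_t x N matrix X(k) from the columns of Q1(k) and Q2(k)."""
    n = len(q1_cols)
    x = [[0j] * n for _ in range(n_t)]            # all entries start at zero
    for k in range(n):
        sp = int("".join(str(b) for b in q2_cols[k]), 2)   # zero-based channel
        x[sp][k] = psk_point(q1_cols[k])          # one nonzero entry per bin
    return x


def apply_signatures_and_sum(x, signatures):
    """Point-wise multiply each channel by its signature, then sum channels."""
    n = len(x[0])
    return [sum(signatures[c][k] * x[c][k] for c in range(len(x)))
            for k in range(n)]


# N=4 subcarriers, m=2 bits per subcarrier, N_t=2 spatial channels.
q1 = [[0, 0], [0, 1], [1, 0], [1, 1]]
q2 = [[0], [1], [1], [0]]
x = sm_ofdm_map(q1, q2, n_t=2)

# Hypothetical flat frequency-domain signatures for the two channels.
signatures = [[1 + 0j] * 4, [1j] * 4]
summed = apply_signatures_and_sum(x, signatures)   # input to the inverse FFT
print(summed)
```

Each column of x has exactly one nonzero entry, which is the property relied upon by the SM type detection in the receiver.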

In the embodiment shown in FIG. 25, the received signal R(t) is first sent to a single FFT module/OFDM demodulator 2520. The frequency domain version of the received signal, R(k), is then sent to a SIMO signature filter bank 2525 (similar to blocks 2330 and 2430) that is applied in the frequency domain to generate a plurality of frequency domain signals, {Y₁(k), . . . , Y_(Nt)(k)}. This provides a set of frequency-domain vector output signals in accordance with y=Hx+w, i.e., equation (31). Because no spatial channel has more than one active antenna (signature filter) at any given particular time at any given particular frequency bin, normal SM type detection can be applied. For example, a channel estimator 2530 can perform maximum receive ratio combining (MRRC) as is known in the art to estimate a sequence of channels corresponding to the bits in the submatrix Q₂(k). Other sequence estimation techniques could also be applied as discussed in connection with CICM and other embodiments herein. For example, if the number of columns of Γ is five times N, then five OFDM symbols could be concatenated and decoded together. Whether joint spatial+signal sequence estimation is used, or whether the channel number sequence is estimated first and then used to help decode the signal symbols second, the end result is to estimate the bits in both Q₁(k) and Q₂(k). Any variety of joint channel number and signal SISO decoding, hard channel number decoding followed by SISO decoding to recover the Q₁(k) sequence, or any other known decoding scheme can be applied as discussed above in connection with FIG. 23. As described in the Mesleh reference, a vector g(k)=H^(H)y(k) can be formed, where H is the N_(t)×N_(t) frequency response matrix of a given subchannel that accounts for all the cross couplings between transmit and receive spatial channels, and y(k) is one of the {Y₁(k), . . . , Y_(Nt)(k)} vectors. The g(k) vector or vectors similar to it can be used in various embodiments as a sufficient statistic/measure to determine which antenna is transmitting at each subcarrier frequency using standard energy detection based algorithms. At each subcarrier, one antenna will be active and the rest quiet/dark as is common in all SM detection discussed herein.
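The energy detection step based on g(k)=H^(H)y(k) can be sketched for a single subcarrier as follows. This is a noiseless, hard-decision illustration under the assumption that the columns of H have comparable norms; it is not the SISO or joint sequence estimation discussed above, and the function name and example channel values are hypothetical.

```python
def detect_active_channel(h, y):
    """Form g = H^H y and return (index of largest |g_j|, g).

    h: channel matrix for one subcarrier, as a list of rows.
    y: received vector for that subcarrier.
    """
    n_t = len(h[0])
    g = [sum(h[r][c].conjugate() * y[r] for r in range(len(h)))
         for c in range(n_t)]
    active = max(range(n_t), key=lambda c: abs(g[c]))
    return active, g


# Hypothetical 2x2 subcarrier response with mild cross coupling.
h = [[1 + 0j, 0.2 + 0j],
     [0.1 + 0j, 1 + 0j]]

# Spatial channel 1 (zero-based) sends the point s; channel 0 is dark.
s = -1 + 0j
y = [h[0][1] * s, h[1][1] * s]   # y = H x with x = [0, s], no noise

active, g = detect_active_channel(h, y)
print(active)
```

Once the active channel index is known for a subcarrier, the corresponding entry of g(k) can be passed to the constellation demapper or to a soft decoder.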

In the alternative embodiment shown in FIG. 26, a SIMO signature filter bank 2620 (similar to blocks 2330 and 2430) is applied in the time domain to generate a plurality of time domain signals, {Y₁(t), . . . , Y_(Nt)(t)}. This provides a set of time-domain vector output signals in accordance with y=Hx+w, i.e., equation (31). Each of these time domain signals is then processed by a set of N_(t) FFT/OFDM demodulators 2525 arranged in parallel to provide a set of frequency domain equivalents {Y₁(k), . . . , Y_(Nt)(k)}. Note that FIG. 26 is close to known SM-OFDM, but requires more FFTs for OFDM demodulation than the more efficient frequency domain embodiment 2500 that was not possible with prior art SM-OFDM. FIG. 26 also uses a plurality of OFDM FFT based demodulators 2615 as opposed to the more efficient approach 2515. However, FIG. 26 is closer to the prior art and illustrates where some of the complexity reductions were achieved as compared to FIG. 1 of the Mesleh reference. Another alternative embodiment would be to swap the order of blocks 2610 and 2615 and to thus implement the block 2610 in the time domain. This could be done in digital signal processing hardware or in an analog or discrete time filtering chip.

Also, while FIG. 25 and FIG. 26 were described in connection with SM-OFDM, the same structures support MIMO-OFDM as well. The difference is that, now that the number of spatial channels can become relatively large, as high as N_(t)=N with only one antenna at each of the transmitter and receiver, it makes sense to allow the spectral efficiency of the system to scale linearly with N_(t) as opposed to log₂(N_(t)). The added cost would be successive interference cancellation or a similar detector as described in connection with FIGS. 21, 22 and 24 and as described more generally in the context of MIMO-modulation.
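The two scaling laws can be compared with a short calculation. The sketch counts coded bits per subcarrier per OFDM symbol only, following the (m+m_(spatial))×N size of Q(k) for SM-OFDM and n_(t) constellation points per interval for MIMO modulation; it ignores code rate, cyclic prefix overhead, and reference tones, and the function names are hypothetical.

```python
def sm_ofdm_bits_per_subcarrier(m, n_t):
    """SM-OFDM: m constellation bits plus log2(N_t) spatial bits."""
    if n_t < 1 or n_t & (n_t - 1):
        raise ValueError("N_t is assumed to be a power of two")
    return m + (n_t.bit_length() - 1)


def mimo_ofdm_bits_per_subcarrier(m, n_t):
    """MIMO-OFDM: every one of the N_t spatial channels carries m bits."""
    return m * n_t


for n_t in (2, 4, 16, 64):
    print(n_t,
          sm_ofdm_bits_per_subcarrier(4, n_t),
          mimo_ofdm_bits_per_subcarrier(4, n_t))
```

With m=4 and N_(t)=16, for example, the count is 8 bits per subcarrier for SM-OFDM and 64 for MIMO-OFDM, which is the gap that the added interference cancellation cost buys.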

System Level and Alternative Embodiments:

FIG. 27 shows a higher level systems architecture 2700 into which any of the CI (constrained interleaving) and/or CICM and/or SM and/or MIMO techniques described herein may be used. A headend system 2705 transmits via a downlink channel to a user device 2710. The user device 2710 transmits back to the headend system 2705 via an uplink channel using a physical layer that includes coding and modulation. The headend system comprises a protocol stack 2720 which includes a physical layer device 2724. The physical layer/coding layer devices 2724, 2732 implement any combination of one or more of CTBC codes, CICM, SM and/or MIMO using constrained interleaving and any other coding and modulation techniques as described in this patent application. The headend system also may include a control and routing module to connect to external networks, databases, and the like. The headend system also contains a computer control module 2729 which comprises processing power coupled to memory. The computer control module 2729 preferably implements any maintenance functions, service provisioning and resource allocation, auto-configuration, software patch downloading and protocol version software downloads, billing, local databases, web page interfaces, upper layer protocol support, subscriber records, and the like.

The user terminal 2710 similarly includes a physical layer interface 2732, a protocol stack 2734 and an application layer module 2736 which may include user interface devices as well as application software. The user terminal 2710 also may optionally include a packet processor 2738 which can be connected to a local area network, for example. The user terminal 2710 may also act as an IP switching node or router in addition to user functions in some embodiments.

Another type of embodiment replaces the headend system 2705 with another user device 2710, in which case direct peer-to-peer communication is enabled. In many applications, though, the headend can act as an intermediary between two user devices to enable indirect peer-to-peer communications using the same headend-to/from-user device uplink/downlink architecture illustrated in FIG. 27. Also, a plurality of networked headends may be employed to the same effect, for example, in a cellular communication system (where the headends are implemented as cellular base stations). Likewise in OTN applications, switching and routing nodes in the backbone may be viewed as peer-to-peer, headend-to-headend connections. In high speed optical LAN applications, the headend may be viewed as a LAN controller and the user device can be an optical LAN connected device.

In a preferred embodiment as directly illustrated by FIG. 27, at least one of the uplink and the downlink channels is implemented using one or more, or any combination, of the members of the family of encoding/modulation/demodulation and decoding schemes as described herein, such as CTBC codes, CICM, SM and/or MIMO. In some types of embodiments, the PHYS 2724, 2732 may also include echo cancellation, cross-talk cancellation, equalization, and other forms of signal conditioning or receiver preprocessing. Alternatively, the headend 2705 and the user station 2710 can be implemented as nodes in a network where the physical layer devices 2724, 2732 implement a backbone communication connection between nodes.

Another aspect of the present invention contemplated by FIG. 27 is the provision of services by a communication services provider. The communication service provider provides a communication service such as, for example, a cellular communications service to a set of subscribers, a wireless data service, or supplies a backbone optical communication service to support a network such as the Internet. The service provider implements FIG. 27 or any of its variants or equivalents described above. The service provider employs the PHYS 2724, 2732 in support of the service. In some cases the service provider also provides the user devices 2710 to the subscribers. This allows the service to be implemented more efficiently and economically than was available with prior art coding technologies.

Although the present invention has been described with reference to specific embodiments, other embodiments may occur to those skilled in the art without deviating from the intended scope. Figures showing block diagrams also identify corresponding methods as well as apparatus. All transmitted signals shown in the Figures can be applied to various types of systems, such as cable modem channels, digital subscriber line (DSL) channels, individual orthogonal frequency division multiplexed (OFDM) sub-channels, wireless channels, SM and MIMO channels, optical channels and the like. In general, more than two component codes can be concatenated together, and embodiments can be created that mix parallel and serial concatenation to form mixed parallel/serial concatenated codes. In such cases the constrained interleaving can be performed on any component-encoded or concatenated encoded bit stream to be interleaved within the mixed encoder structure to satisfy a constraint that is designed to jointly optimize or otherwise improve bit error rate performance by jointly increasing a measure of minimum distance and reducing the effect of one or more dominant error coefficients of the mixed encoded bit stream. The concepts presented herein can be extrapolated to these higher order cases by induction. This patent application contains various block diagrams and flow charts. It is to be understood that sub-portions of any of the block diagrams or flow charts can be used to extract apparatus, systems and methods that correspond to just the sub-portion of the block diagram or flow chart. Block diagrams in many cases can be indicative of all of methods, apparatus, and systems. Also, it is understood that an inner code in a concatenation can be replaced in many cases by a modulator such as a TCM, BICM, or CICM. That is, a serial concatenated code may be formed by an outer encoder followed by a constrained interleaver, followed by a signal mapper such as TCM, BICM, or CICM. Such embodiments of CTBC codes are contemplated herein.

Also, FIG. 27 can expressly be mixed with any of the other figures to construct communication systems and communication services. Also, sub-portions of any of the block diagrams or flow charts can be broken off and merged with other sub-portions of any other block diagrams or flow charts from one or more separate figures to arrive at other devices, apparatus, systems and methods. All such combinations are expressly contemplated herein, although it would take too much space to enumerate them all. Therefore the present disclosure is to be understood to include all such combinations of the material disclosed herein. Hence it is noted that all such embodiments and variations are contemplated by the present invention.

Also, it is to be noted that much of the description herein relates to computer, digital communications, and digital signal processing technology, and all of the block diagrams and flowcharts and related description herein can, in whole or in part, be implemented using processor technology. For example, apparatus and systems can comprise one or more processors coupled to one or more memories, and also coupled to other input/output devices such as channel interfaces, line interfaces, communication protocol stack upper layers, user interfaces, user input/output devices, switching fabrics, OTN backbone links, optical LAN interfaces, and the like. In such systems, instructions can be stored in the one or more memories to cause one or more functional units in one or more of the processors to carry out actions or steps to implement any aspects of the block diagrams or flow charts herein. Also, special hardware can be hardwired, so that no instruction stream is needed to carry out certain actions such as highly repetitive/periodic processing. In such cases microsequencing logic can be built into dedicated control circuits to cause the hardware to loop through each frame of encoding, decoding, modulation, demodulation, and the like. The apparatus, systems and methods presented herein can be configured to perform computerized sequences of operations; however, the operations themselves are provided to solve problems that are necessarily rooted in computer and electronic communications technology in order to overcome specific problems that specifically arise in the realm of computer networks, local area networks, wide area networks, link layer communications, and physical layer communications. For example, errors naturally occur due to noise, distortion, and other impairments physically introduced by a communication channel. The techniques developed herein provide solutions to recovering a message sequence at a receiver with error recovery and error avoidance in light of these physical technology-induced channel impairments.

Finally, it is recalled that U.S. Pat. No. 8,537,919 and U.S. Pat. No. 8,532,209, by the same inventors and dealing with constrained interleaving related technology, are incorporated herein by reference. In these incorporated-by-reference patents, CI-1 and CI-2 are presented. Likewise, a number of specific systems are presented therein, such as constrained turbo product codes (both the outer code and the inner codes are block codes), multiple concatenations, and the like. Hence it is to be understood that the present invention also contemplates modifying any specific embodiment (e.g., block diagram, flow chart, or written description portion) of these incorporated-by-reference patents by making any modification as disclosed in the instant patent application. For example, any place a signal mapper is discussed in the incorporated-by-reference patents, CICM or a version of CICM-SM or CICM-MIMO would be used as the signal mapper. Any time BICM is mentioned in the incorporated-by-reference patents, CICM could be substituted to obtain an embodiment in accordance with the present invention. Likewise, any time CI-1 or CI-2 is mentioned in any disclosed embodiment in these incorporated-by-reference patents, a new embodiment in accordance with the present invention could be obtained by specifically reciting the new specific species SRCI, CI-3, or CI-4 of the more general genus of inventions, CI, as disclosed in these incorporated-by-reference patents.

As a specific example of how this would occur in practice, consider the steps involved in the construction of TPCs (turbo product codes) that are constructed in accordance with CI-1, which used a constrained interleaver design matrix with randomization along the rows and columns. Note that the CI-1 interleaver matrix ensures that the coded bits of any codeword of the OBC are fed into different codewords of the IBC (inner block code). In addition, the randomizations along all L=k_(i) rows and then all nρ′ columns guarantee that coded bits are placed with the highest possible level of randomness, allowing any coded bit of any OBC to be placed anywhere in the interleaved sequence u subject to the above constraint. In other words, a TPC designed according to CI-1 uniformly randomizes positions subject to the constraint that no two coded bits of any codeword of the OBC are allowed to be fed into the same codeword of the IBC. With that observation and with the intention of feeding blocks of k_(i) interleaved bits into the IBC, the SRCI counterpart can be designed by considering a block structure in the interleaver and constraining that any two coded bits of a codeword of the OBC cannot be placed in the same block of k_(i) bits of the interleaved sequence. Hence, in SR-CTPC, for every coded bit c_(jt), which is at position i=(jn+t) on c, where j=0, 1, . . . , (ρ−1), t=0, 1, . . . , (n−1), an interleaved position π(i) can be found on u by using the following steps:

(a) for each q, 0≦q<p, the restricted zone is from X(q) to Y(q) (including X(q) and Y(q)) on u, where X(q)=k_(i)└π(q)/k_(i)┘ and Y(q)=X(q)+k_(i)−1, and

(b) randomly select a position among the remaining vacant positions on u as π(i). In order to treat all ρ codewords in the same manner, every selected coded bit position p (0≦p<n) of all codewords can be placed on u, one coded bit position at a time, starting from p=0 and moving up to p=(n−1).
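Steps (a) and (b) can be illustrated with a minimal Python sketch. The sketch is an illustration under stated assumptions and is not the inventors' implementation: the function name srci_permutation, its parameters, and the restart-on-dead-end strategy are all hypothetical, and it assumes that ρn is a multiple of k_(i) and that the restricted zones of a coded bit are the k_(i)-bit blocks already occupied by earlier bits of the same OBC codeword.

```python
import random


def srci_permutation(rho, n, k_i, rng=None, max_tries=1000):
    """Sketch of an SRCI-style placement (hypothetical helper).

    Returns a list pi where pi[j*n + t] is the position on u of coded
    bit t of OBC codeword j. No two bits of one codeword share a
    k_i-bit block of u.
    """
    rng = rng or random.Random()
    total = rho * n
    if total % k_i != 0:
        raise ValueError("rho*n must be a multiple of k_i")

    for _ in range(max_tries):
        pi = [None] * total
        vacant = set(range(total))
        # used_blocks[j]: blocks of u already holding a bit of codeword j;
        # these are the restricted zones [X(q), Y(q)] of step (a).
        used_blocks = [set() for _ in range(rho)]
        stuck = False

        # One coded bit position p at a time, across all rho codewords.
        for p in range(n):
            for j in range(rho):
                allowed = [pos for pos in sorted(vacant)
                           if pos // k_i not in used_blocks[j]]
                if not allowed:
                    # Assumption: a random greedy pass can reach a dead
                    # end, so the sketch simply restarts from scratch.
                    stuck = True
                    break
                pos = rng.choice(allowed)  # step (b)
                pi[j * n + p] = pos
                vacant.remove(pos)
                used_blocks[j].add(pos // k_i)
            if stuck:
                break
        if not stuck:
            return pi
    raise RuntimeError("no valid placement found within max_tries")


if __name__ == "__main__":
    pi = srci_permutation(rho=16, n=4, k_i=4, rng=random.Random(1))
    print(pi)
```

The restart loop is a design choice of the sketch only; a placement satisfying the constraint requires at least n blocks, that is, ρ of at least k_(i).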

As another example, consider FIG. 18 of U.S. Pat. No. 8,532,209. Using this figure, the present invention would include an embodiment that could be described as having all of blocks 1810, 1010A and 1820 specifically recited to have their interleavers implemented using a SRCI such as CI-3 and/or CI-4. While U.S. Pat. No. 8,532,209 described the CI genus, it only described the CI-1 and CI-2 species. Hence, while inventions in U.S. Pat. No. 8,532,209 could be described to recite embodiments using the CI genus, the current patent application specifically contemplates all recitable inventions in U.S. Pat. No. 8,532,209, but with the specific new SRCI, CI-3 and CI-4 species. Also and alternatively, for example, in the class of concatenation shown in FIG. 18 of U.S. Pat. No. 8,532,209, the blocks 1820 and 1825 can be implemented using a CICM signal mapper that comprises a CICM permutation 1820 followed by a constellation mapper 1825 that acts as the inner code in a double concatenation. It is also noted that r2ρ2 in FIG. 18 of U.S. Pat. No. 8,532,209 may be set to one. All such variations and embodiments are specifically contemplated by the present invention.

It can also be noted that in all of the optical and non-optical SM embodiments discussed herein, instead of jointly SISO decoding both the spatial constellation point (i.e., the channel number) and the signal constellation point, a two-step process may be used. That is, a first detector/channel estimator can be used to identify one or a sequence of channel numbers through which the SM signal was transmitted in a given one or a sequence of symbol intervals, and a second detector/decoder can be used to estimate one or a sequence of signal constellation points. For example, the SISO decoder can be broken up into first and second SISO decoders (a spatial constellation decoder followed by a signal constellation decoder), or other types of arrangements can be used. For example, a hard decoder or a hard iterative decoder can be used to estimate the sequence of spatial constellation points (sequence of channel numbers) and a SISO decoder can then be used to estimate the sequence of signal constellation points. Other types of arrangements that iterate between these two decoders can also be configured. Also, a channel pre-estimator portion could be used to narrow the search space and reduce the complexity of the SISO decoder. For example, if there are a total of 256 possible channels through which the SM signal can be transmitted each symbol interval, the channel pre-estimator could be configured to identify a sequence of the 16 or fewer most likely channels through which the SM signal was transmitted each symbol interval over a given frame interval. A reduced-complexity joint spatial-signal constellation SISO decoder could then be configured to use this channel pre-estimation information to narrow the search space when iterating to jointly find the sequence of spatial and signal constellation points that were transmitted during each symbol interval. To simplify further, for example, if out of the 16 channel estimates, only four are above a threshold during a given symbol interval, the joint SISO decoder could reduce its complexity further by only operating on the metrics that are above the threshold. All such variations are contemplated by the present invention.
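The search-space narrowing described above can be sketched in a few lines of Python. This is only an illustration: the function name preselect_channels, the representation of channel estimates as one numeric metric per channel (larger meaning more likely), and the fallback to the unthresholded list when no metric exceeds the threshold are assumptions that do not appear in the description.

```python
def preselect_channels(metrics, keep=16, threshold=None):
    """Sketch of a channel pre-estimator for one symbol interval.

    metrics   -- one likelihood-style score per spatial channel
                 (assumed: larger means more likely)
    keep      -- maximum number of candidate channels to retain
    threshold -- optional cut; only candidates scoring strictly above
                 it are passed to the joint SISO decoder
    Returns candidate channel numbers, most likely first.
    """
    ranked = sorted(range(len(metrics)),
                    key=lambda ch: metrics[ch], reverse=True)
    candidates = ranked[:keep]
    if threshold is not None:
        above = [ch for ch in candidates if metrics[ch] > threshold]
        # Assumption: never hand the decoder an empty candidate list.
        if above:
            candidates = above
    return candidates


if __name__ == "__main__":
    # 256 possible channels; 20 of them carry a nonzero toy metric.
    metrics = [0] * 256
    for c in range(20):
        metrics[c * 10] = c + 1
    print(preselect_channels(metrics))                # 16 candidates
    print(preselect_channels(metrics, threshold=16))  # 4 candidates
```

A joint spatial-signal decoder would then evaluate its metrics only over the returned channel numbers instead of all 256.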

It is therefore noted that any specific embodiment recited in any specifically drafted claim is what governs the claim scope of all recited claims in this application and any continuations, divisionals, or international filings derived herefrom. The disclosure provided herein is meant to explain how to construct all of these voluminous different types of embodiments and to explicitly show one of ordinary skill in the art how to readily construct them using standard levels of engineering creativity and engineering know-how as would be expected by one of ordinary skill in the art. The recited claims are provided to identify the scope of the claimed inventions.

What we claim is:
 1. A communications apparatus comprising: an encoder that converts a sequence of bits to an encoded bit sequence in accordance with an encoding rule, wherein the encoded bit sequence contains an integer number, K, of encoded bits, and the encoding rule has the property that, for all possible sequences of bits to be encoded, all possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) can be identified and enumerated, none of the possible low weight encoded bit sequences, i_(P), can have a weight less than d_(t), and the weights d_(t)≦d≦d_(f) correspond to Hamming distances; a constrained interleaver configured to implement a permutation rule that permutes the K encoded bits of the encoded bit sequence to a sequence of subsets, wherein each subset contains m encoded bits; a constellation mapper coupled to receive the sequence of subsets and configured to constellation-map the sequence of subsets to a sequence of signal constellation points in accordance with a constellation mapping rule; wherein the permutation rule and the constellation mapping rule are jointly selected to ensure that a pre-defined target value of MSED (minimum squared Euclidean distance) is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f).
 2. The communications apparatus of claim 1, wherein the sequence of subsets includes K/m number of subsets.
 3. The communications apparatus of claim 2, wherein the sequence of constellation points includes K/m number of 2^(m)-ary signal constellation points.
 4. The communications apparatus of claim 1, wherein the permutation rule and the constellation mapping rule are further jointly selected to ensure that a pre-defined target value of minimum symbol Hamming distance, d_(s), is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f).
 5. The communications apparatus of claim 4, wherein the Hamming distance d_(f) is selected such that any possible encoded bit sequence, i_(P), that has a weight d>d_(f) will be guaranteed to have at least the pre-defined target value of MSED and the pre-defined target value of minimum symbol Hamming distance, d_(s).
 6. The communications apparatus of claim 4, wherein the constellation mapper uses an anti-Gray coding constellation mapping rule.
 7. The communications apparatus of claim 4, wherein the constellation mapper uses an RGC (Reverse Gray coding) constellation mapping rule.
 8. The communications apparatus of claim 1, wherein the Hamming distance d_(f) is selected such that any possible encoded bit sequence, i_(P), that has a weight d>d_(f) will be guaranteed to have at least the pre-defined target value of MSED.
 9. The communications apparatus of claim 1, wherein the constellation mapper uses an RGC (Reverse Gray coding) constellation mapping rule.
 10. The communications apparatus of claim 1, wherein the constellation mapper uses an anti-Gray coding constellation mapping rule.
 11. The communications apparatus of claim 1, further comprising a spatial mapper, wherein the combination of the constellation mapper and the spatial mapper comprises a constellation and spatial mapper configured to couple the sequence of signal constellation points through a sequence of selected ones of a plurality of spatial channels in accordance with a spatial modulation rule.
 12. The communications apparatus of claim 11, wherein the sequence of selected ones of a plurality of spatial channels is selected to ensure that the pre-defined target value of MSED (minimum squared Euclidean distance) is maintained for all of the possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) that traverse the plurality of spatial channels.
 13. The communications apparatus of claim 11, wherein each one of the plurality of spatial channels comprises an output from a different one of a set of antennas.
 14. The communications apparatus of claim 11, wherein each one of the plurality of spatial channels comprises a path through a different one of a set of discrete-time optical filters in a bank of discrete-time optical filters.
 15. The communications apparatus of claim 1, further comprising a spatial mapper, wherein the combination of the constellation mapper and the spatial mapper comprises a constellation and spatial mapper configured to couple the sequence of signal constellation points through a plurality of spatial channels in accordance with a MIMO (multiple input multiple output) modulation rule.
 16. The communications apparatus of claim 1, wherein the encoder further comprises: an outer encoder that is coupled to receive the sequence of bits and to produce therefrom a sequence of outer-encoded bits in accordance with a member of the group consisting of a block code, a finite-length convolutional code and an LDPC (low density parity check) code; a second constrained interleaver that is coupled to receive the sequence of outer-encoded bits and is operable to produce a permuted sequence of outer-encoded bits subject to one or more constraints that ensure that none of the possible low weight encoded bit sequences, i_(P), can have a weight less than d_(t); an inner coder that is coupled to receive the permuted sequence of outer-encoded bits and to produce a sequence of inner-encoded bits in accordance with an inner code that is a recursive convolutional code; wherein the encoded bit sequence corresponds to the sequence of inner-encoded bits.
 17. The communications apparatus of claim 16, wherein the one or more constraints include at least one SRCI (single row constrained interleaving) constraint.
 18. The communications apparatus of claim 17, wherein the SRCI constraint is a member of the group consisting of a CI-3 constraint and a CI-4 constraint.
 19. The communications apparatus of claim 18, wherein the one or more constraints further include a vectorization constraint and the second constrained interleaver is a contention free deterministic constrained interleaver.
 20. The communications apparatus of claim 1, wherein the encoder is configured to encode in accordance with a CTBC (constrained turbo block convolutional) code.
 21. The communications apparatus of claim 1, wherein the encoder is configured to encode in accordance with a block code.
 22. The communications apparatus of claim 1, wherein the encoder is configured to encode in accordance with a convolutional code.
 23. The communications apparatus of claim 1, wherein the encoder is configured to encode in accordance with a turbo code for which all possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) can be identified and enumerated.
 24. The communications apparatus of claim 1, wherein the encoder is configured to encode in accordance with an LDPC code for which all possible low weight encoded bit sequences, i_(P), of weights d_(t)≦d≦d_(f) can be identified and enumerated.